Creating an in-house
computerized adaptive testing
(CAT) program with Concerto
Atsushi, MIZUMOTO
(Kansai University)
2013/09/20
JLTA at Waseda University
Computerized Adaptive Testing
CAT needs
Item Response Theory
CTT vs. IRT
Aspect                          CTT               IRT
Test score                      Ordinal scale     Interval scale
Ability estimate                Test-dependent    Test-independent
Test result                     Person-dependent  Person-independent
Measurement target (precision)  All test-takers   Individuals
Equating/CAT                    Difficult         Easy
Ohtomo (2009)
CAT Needs IRT
History of CAT Research
40 years
(Thompson & Weiss, 2011)
30 in language testing (LT)
(Koyama, 2010)
Example of CAT
CBT ≠ CAT
How CAT Works
http://www.j-cat.org/page/interpret
Advantages of CAT
•Tailored for individual test-takers
•Shorter test time
•Higher precision (= smaller SE)
•No need for random sampling
www.geocities.jp/kosugitti/labo/irtnote.pdf
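The link between precision and SE can be written out. As a sketch under standard IRT assumptions (the symbols $I(\theta)$ and $I_i(\theta)$ are introduced here and do not appear on the slides), the standard error of the ability estimate shrinks as the information of the administered items accumulates:

```latex
\[
  \mathrm{SE}(\hat{\theta}) \approx \frac{1}{\sqrt{I(\hat{\theta})}},
  \qquad
  I(\theta) = \sum_{i \in \text{administered}} I_i(\theta)
\]
```

Because a CAT tends to pick items that are informative near the current ability estimate, it can reach a given SE with fewer items than a fixed form whose items are spread across all levels.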
Purposes
•Creating a CAT program
•Evaluation
Creating a CAT Program
•Choosing the CAT System
•Constructing an Item Bank (Pretest)
•Calibrating the Item Bank
•Determining Specifications & Feedback
•Administering the CAT
Moodle Plugin
http://moodle2x.info
1. Free account (150 test takers/month)
2. Amazon Machine Images (free for a year)
3. Installing it on your own server
•Open-source
•Running R on a server (catR, RMySQL)
•HTML-based
Installation on a server
https://code.google.com/p/concerto-platform/wiki/installation4
Wiki (Resources)
https://code.google.com/p/concerto-platform/wiki/Resources?tm=6
Constructing an Item Bank
(Pretest)
•Vocabulary Test (Mizumoto, 2006)
http://www.mizumot.com/files/VocSizeMeasure.pdf
•Based on SVL 12,000
(Up to 8,000 level; 30 items for each level)
•716 university EFL learners
Sample Question
(1) 心の, 精神の ("of the mind, mental")
A. essential
B. creative
C. loose
D. mental
Calibrating the Item Bank
•240 items analyzed (Rasch model)
•150 items left for the item bank
•Calibrated with the two-parameter logistic (2PL) model
(item difficulty & discrimination)
•Upload the CSV file to Concerto
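To make the calibration output concrete, here is a minimal Python sketch of what a 2PL item bank encodes. The CSV column names (`id`, `a`, `b`), the three rows, and the plain logistic metric (no 1.7 scaling constant) are assumptions for illustration; they are not the actual file format or parameter values used with Concerto.

```python
import csv
import io
import math

# Hypothetical item bank: a = discrimination, b = difficulty (made-up values).
ITEM_BANK_CSV = """id,a,b
v001,1.2,-0.5
v002,0.8,0.0
v003,1.5,1.0
"""

def load_item_bank(text):
    """Parse CSV text into a list of item dicts with numeric parameters."""
    rows = csv.DictReader(io.StringIO(text))
    return [{"id": r["id"], "a": float(r["a"]), "b": float(r["b"])} for r in rows]

def p_correct(theta, a, b):
    """2PL probability that a person with ability theta answers correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

bank = load_item_bank(ITEM_BANK_CSV)
for item in bank:
    p = p_correct(0.0, item["a"], item["b"])
    info = item_information(0.0, item["a"], item["b"])
    print(f'{item["id"]}: P(correct | theta=0) = {p:.3f}, information = {info:.3f}')
```

An item is most informative where theta is close to its difficulty b, and the peak height grows with the square of the discrimination a.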
Specifications of CAT
•Starting point
(parameters, initial ability, randomized/fixed)
•Ability estimation method
(empirical Bayes and others)
•Stopping rule
(Number of items/Standard error)
•Final ability estimation
Magis and Raîche (2012, p. 7)
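The specification items above can be sketched as a small configuration object in Python. All names here (`CatSpec`, `start_pool`, `choose_first_item`, `should_stop`) are hypothetical, and combining the two stopping criteria with "or" is an assumption; this is not how Concerto or catR represents these settings.

```python
import random
from dataclasses import dataclass

@dataclass
class CatSpec:
    initial_theta: float = 0.0    # starting point: initial ability
    randomize_start: bool = True  # starting point: randomized vs. fixed first item
    start_pool: int = 5           # hypothetical: number of candidates for a randomized start
    max_items: int = 30           # stopping rule: number of items
    se_target: float = 0.3        # stopping rule: standard error

def choose_first_item(difficulties, spec, rng):
    """Return the index of the first item to administer."""
    # Rank items by how close their difficulty is to the initial ability.
    ranked = sorted(range(len(difficulties)),
                    key=lambda i: abs(difficulties[i] - spec.initial_theta))
    if spec.randomize_start:
        # Randomized start: pick one of the closest few items.
        return rng.choice(ranked[:spec.start_pool])
    # Fixed start: always the single closest item.
    return ranked[0]

def should_stop(n_given, se, spec):
    """Stop when either the item limit or the SE target is reached."""
    return n_given >= spec.max_items or se <= spec.se_target

difficulties = [-2.0, -1.0, -0.2, 0.1, 0.9, 2.0]  # made-up difficulties
fixed = CatSpec(randomize_start=False)
print("fixed start item index:", choose_first_item(difficulties, fixed, random.Random(0)))
print("stop after 12 items at SE 0.28?", should_stop(12, 0.28, CatSpec()))
```

A randomized start keeps test-takers from all seeing the same first item, at the cost of a slightly less targeted opening.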
How many items for what SE?
•Simulation with catR package
Magis, D., & Raîche, G. (2012).
http://www.jstatsoft.org/v48/i08
True Theta = 1, SE = 0.3
Stopping rule = 30 items
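The slides ran this simulation with the catR package in R. As a rough Python analogue (not the catR implementation), the sketch below simulates one test-taker with true theta = 1 on a synthetic 150-item 2PL bank, using maximum-information item selection, an EAP ability estimate with a standard normal prior, and a 30-item stopping rule. The item parameters, the seed, and the choice of EAP are assumptions made for illustration.

```python
import math
import random

GRID = [k / 10.0 for k in range(-40, 41)]  # ability grid from -4 to 4

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def information(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def eap(responses):
    """EAP estimate and posterior SD; responses is a list of (a, b, u)."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized N(0, 1) prior
        for a, b, u in responses:
            p = p_correct(t, a, b)
            w *= p if u else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def simulate_cat(true_theta=1.0, n_bank=150, max_items=30, seed=1):
    rng = random.Random(seed)
    # Synthetic bank with made-up parameters, not the calibrated vocabulary items.
    bank = [(rng.uniform(0.8, 2.0), rng.uniform(-3.0, 3.0)) for _ in range(n_bank)]
    theta = 0.0
    responses, history = [], []
    while len(responses) < max_items:
        # Select the unused item with maximum information at the current estimate.
        best = max(range(len(bank)), key=lambda i: information(theta, *bank[i]))
        a, b = bank.pop(best)
        # Simulate the response from the true ability.
        u = 1 if rng.random() < p_correct(true_theta, a, b) else 0
        responses.append((a, b, u))
        theta, se = eap(responses)
        history.append((len(responses), theta, se))
    return history

history = simulate_cat()
for n, theta, se in history:
    if n % 5 == 0:
        print(f"items={n:2d}  theta={theta:5.2f}  SE={se:4.2f}")
```

Reading the printed SE column against the item count gives the same kind of answer the slide asks for: how many items are needed before the SE falls to a chosen target.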
Concerto
http://langtest.jp/concerto/?tid=20
Feedback Page
268 test takers
(university first year)
(1) CAT
(2) Paper-pencil version
(68 items) common person linking
(3) Questionnaire
“What did you think of
the CAT result?”
Evaluation
CAT vs. Paper-pencil
[Figure: scatter plots of CAT theta vs. paper-pencil theta; r = 0.92, n = 268]
CAT: random, 30 Qs
Paper-pencil: fixed, 68 Qs
Theta estimates (n = 268)
CAT (30 Qs): M = 1.71, SD = 1.13
P-P (68 Qs): M = 1.72, SD = 0.95
Mean diff. = -0.02, 95% CI [-0.07, 0.04]
d = 0.01, Power = .06
Standard errors (n = 268)
CAT SE (30 Qs): M = 0.39, SD = 0.11
P-P SE (68 Qs): M = 1.71, SD = 1.13
Mean diff. of SE = -1.32, 95% CI [-1.44, -1.19]
d = 1.65, Power = 0.99
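The reported effect sizes can be roughly checked from the summary statistics. The sketch below assumes Cohen's d was computed with the root mean square of the two SDs as the standardizer; the slides do not say which formula was used, so this is only one plausible reading.

```python
import math

def cohens_d(m1, sd1, m2, sd2):
    """Cohen's d using the root mean square of the two SDs (an assumed formula)."""
    pooled = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2.0)
    return (m1 - m2) / pooled

# Summary statistics as reported on the slides (CAT first, paper-pencil second).
d_theta = cohens_d(1.71, 1.13, 1.72, 0.95)
d_se = cohens_d(0.39, 0.11, 1.71, 1.13)

print(f"theta: |d| = {abs(d_theta):.2f}")
print(f"SE:    |d| = {abs(d_se):.2f}")
```

Under this assumption the theta comparison reproduces the negligible effect and the SE comparison lands close to the reported large effect; a small gap is expected because the inputs here are rounded to two decimals.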
Evaluation
CAT vs. Paper-pencil
Means: CAT = Paper-pencil
SEs: CAT < Paper-pencil
CAT measures the same ability
with much more precision
(with fewer items).
Evaluation
Questionnaire
Result of the Questionnaire
[Figure: bar chart of response frequencies (axis 0 to 150 on each side); response options: Very inaccurate, Inaccurate, Rather inaccurate, Rather accurate, Accurate, Very accurate]
Feedback Page
Future Research
•More items in the item bank
•Better formula for predicting
other test scores
•Improved feedback
•Collaboration
Summary
•Created a CAT program
•Evaluation
(1) CAT better than Paper-pencil
(2) Feedback needs improvement.

Más contenido relacionado

La actualidad más candente

Computer based test designs (cbt)
Computer based test designs (cbt)Computer based test designs (cbt)
Computer based test designs (cbt)munsif123
 
Test equating using irt. final
Test equating using irt. finalTest equating using irt. final
Test equating using irt. finalmunsif123
 
Test construction
Test constructionTest construction
Test constructionmunsif123
 
Towards a pattern recognition approach for transferring knowledge in acm v4 f...
Towards a pattern recognition approach for transferring knowledge in acm v4 f...Towards a pattern recognition approach for transferring knowledge in acm v4 f...
Towards a pattern recognition approach for transferring knowledge in acm v4 f...Thanh Tran
 
A/B testing from basic concepts to advanced techniques
A/B testing  from basic concepts to advanced techniquesA/B testing  from basic concepts to advanced techniques
A/B testing from basic concepts to advanced techniquesAnatoliy Vuets
 
Statistical hypothesis testing in e commerce
Statistical hypothesis testing in e commerceStatistical hypothesis testing in e commerce
Statistical hypothesis testing in e commerceAnatoliy Vuets
 
A Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems
A Top-N Recommender System Evaluation Protocol Inspired by Deployed SystemsA Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems
A Top-N Recommender System Evaluation Protocol Inspired by Deployed SystemsAlan Said
 
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...Alan Said
 
Programming with GUTs
Programming with GUTsProgramming with GUTs
Programming with GUTscatherinewall
 
AUTOMATIC GENERATION AND OPTIMIZATION OF TEST DATA USING HARMONY SEARCH ALGOR...
AUTOMATIC GENERATION AND OPTIMIZATION OF TEST DATA USING HARMONY SEARCH ALGOR...AUTOMATIC GENERATION AND OPTIMIZATION OF TEST DATA USING HARMONY SEARCH ALGOR...
AUTOMATIC GENERATION AND OPTIMIZATION OF TEST DATA USING HARMONY SEARCH ALGOR...csandit
 
Test design made easy (and fun) Rik Marselis EuroSTAR
Test design made easy (and fun) Rik Marselis EuroSTARTest design made easy (and fun) Rik Marselis EuroSTAR
Test design made easy (and fun) Rik Marselis EuroSTARRik Marselis
 
Expanding our Testing Horizons
Expanding our Testing HorizonsExpanding our Testing Horizons
Expanding our Testing HorizonsMark Micallef
 
No struggle with test design (presentation at TestExpo 2015 Denmark)
No struggle with test design (presentation at TestExpo 2015 Denmark)No struggle with test design (presentation at TestExpo 2015 Denmark)
No struggle with test design (presentation at TestExpo 2015 Denmark)Rik Marselis
 
Caveon webinar series - smart items- using innovative item design to make you...
Caveon webinar series - smart items- using innovative item design to make you...Caveon webinar series - smart items- using innovative item design to make you...
Caveon webinar series - smart items- using innovative item design to make you...Caveon Test Security
 
Chaplin school of hospitality and tourism management inter
Chaplin school of hospitality and tourism management interChaplin school of hospitality and tourism management inter
Chaplin school of hospitality and tourism management interRAJU852744
 
Bridging the Gap: Machine Learning for Ubiquitous Computing -- Evaluation
Bridging the Gap: Machine Learning for Ubiquitous Computing -- EvaluationBridging the Gap: Machine Learning for Ubiquitous Computing -- Evaluation
Bridging the Gap: Machine Learning for Ubiquitous Computing -- EvaluationThomas Ploetz
 
Applied Psych Test Design: Part C - Use of Rasch scaling technology
Applied Psych Test Design: Part C - Use of Rasch scaling technologyApplied Psych Test Design: Part C - Use of Rasch scaling technology
Applied Psych Test Design: Part C - Use of Rasch scaling technologyKevin McGrew
 
Guidelines to Understanding Design of Experiment and Reliability Prediction
Guidelines to Understanding Design of Experiment and Reliability PredictionGuidelines to Understanding Design of Experiment and Reliability Prediction
Guidelines to Understanding Design of Experiment and Reliability Predictionijsrd.com
 

La actualidad más candente (20)

Computer based test designs (cbt)
Computer based test designs (cbt)Computer based test designs (cbt)
Computer based test designs (cbt)
 
Test equating using irt. final
Test equating using irt. finalTest equating using irt. final
Test equating using irt. final
 
Test construction
Test constructionTest construction
Test construction
 
Towards a pattern recognition approach for transferring knowledge in acm v4 f...
Towards a pattern recognition approach for transferring knowledge in acm v4 f...Towards a pattern recognition approach for transferring knowledge in acm v4 f...
Towards a pattern recognition approach for transferring knowledge in acm v4 f...
 
A/B testing from basic concepts to advanced techniques
A/B testing  from basic concepts to advanced techniquesA/B testing  from basic concepts to advanced techniques
A/B testing from basic concepts to advanced techniques
 
Statistical hypothesis testing in e commerce
Statistical hypothesis testing in e commerceStatistical hypothesis testing in e commerce
Statistical hypothesis testing in e commerce
 
A Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems
A Top-N Recommender System Evaluation Protocol Inspired by Deployed SystemsA Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems
A Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems
 
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
 
Programming with GUTs
Programming with GUTsProgramming with GUTs
Programming with GUTs
 
AUTOMATIC GENERATION AND OPTIMIZATION OF TEST DATA USING HARMONY SEARCH ALGOR...
AUTOMATIC GENERATION AND OPTIMIZATION OF TEST DATA USING HARMONY SEARCH ALGOR...AUTOMATIC GENERATION AND OPTIMIZATION OF TEST DATA USING HARMONY SEARCH ALGOR...
AUTOMATIC GENERATION AND OPTIMIZATION OF TEST DATA USING HARMONY SEARCH ALGOR...
 
Test design made easy (and fun) Rik Marselis EuroSTAR
Test design made easy (and fun) Rik Marselis EuroSTARTest design made easy (and fun) Rik Marselis EuroSTAR
Test design made easy (and fun) Rik Marselis EuroSTAR
 
Expanding our Testing Horizons
Expanding our Testing HorizonsExpanding our Testing Horizons
Expanding our Testing Horizons
 
No struggle with test design (presentation at TestExpo 2015 Denmark)
No struggle with test design (presentation at TestExpo 2015 Denmark)No struggle with test design (presentation at TestExpo 2015 Denmark)
No struggle with test design (presentation at TestExpo 2015 Denmark)
 
Caveon webinar series - smart items- using innovative item design to make you...
Caveon webinar series - smart items- using innovative item design to make you...Caveon webinar series - smart items- using innovative item design to make you...
Caveon webinar series - smart items- using innovative item design to make you...
 
Chaplin school of hospitality and tourism management inter
Chaplin school of hospitality and tourism management interChaplin school of hospitality and tourism management inter
Chaplin school of hospitality and tourism management inter
 
Bridging the Gap: Machine Learning for Ubiquitous Computing -- Evaluation
Bridging the Gap: Machine Learning for Ubiquitous Computing -- EvaluationBridging the Gap: Machine Learning for Ubiquitous Computing -- Evaluation
Bridging the Gap: Machine Learning for Ubiquitous Computing -- Evaluation
 
Applied Psych Test Design: Part C - Use of Rasch scaling technology
Applied Psych Test Design: Part C - Use of Rasch scaling technologyApplied Psych Test Design: Part C - Use of Rasch scaling technology
Applied Psych Test Design: Part C - Use of Rasch scaling technology
 
Bad Metric, Bad!
Bad Metric, Bad!Bad Metric, Bad!
Bad Metric, Bad!
 
Guidelines to Understanding Design of Experiment and Reliability Prediction
Guidelines to Understanding Design of Experiment and Reliability PredictionGuidelines to Understanding Design of Experiment and Reliability Prediction
Guidelines to Understanding Design of Experiment and Reliability Prediction
 
De carlo rizk 2010 icelw
De carlo rizk 2010 icelwDe carlo rizk 2010 icelw
De carlo rizk 2010 icelw
 

Similar a Creating an in-house computerized adaptive testing (CAT) program with Concerto

A framework and approaches to develop an in-house CAT with freeware and open ...
A framework and approaches to develop an in-house CAT with freeware and open ...A framework and approaches to develop an in-house CAT with freeware and open ...
A framework and approaches to develop an in-house CAT with freeware and open ...Tetsuo Kimura
 
Automated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design ProblemsAutomated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design ProblemsAnubhav Jain
 
MSPresentation_Spring2011
MSPresentation_Spring2011MSPresentation_Spring2011
MSPresentation_Spring2011Shaun Smith
 
⭐⭐⭐⭐⭐ Finding a Dynamical Model of a Social Norm Physical Activity Intervention
⭐⭐⭐⭐⭐ Finding a Dynamical Model of a Social Norm Physical Activity Intervention⭐⭐⭐⭐⭐ Finding a Dynamical Model of a Social Norm Physical Activity Intervention
⭐⭐⭐⭐⭐ Finding a Dynamical Model of a Social Norm Physical Activity InterventionVictor Asanza
 
Topic Set Size Design with Variance Estimates from Two-Way ANOVA
Topic Set Size Design with Variance Estimates from Two-Way ANOVATopic Set Size Design with Variance Estimates from Two-Way ANOVA
Topic Set Size Design with Variance Estimates from Two-Way ANOVATetsuya Sakai
 
RS in the context of Big Data-v4
RS in the context of Big Data-v4RS in the context of Big Data-v4
RS in the context of Big Data-v4Khadija Atiya
 
Can AI Tell Emerging Technologies
Can AI Tell Emerging TechnologiesCan AI Tell Emerging Technologies
Can AI Tell Emerging TechnologiesSeonho Kim
 
Fundamentals of Engineering Design
Fundamentals of Engineering DesignFundamentals of Engineering Design
Fundamentals of Engineering Designasuarea48
 
ES2022-Minh-Nguyen-ShapingTestsIntoModelsForAutomatedTCGeneration.pdf
ES2022-Minh-Nguyen-ShapingTestsIntoModelsForAutomatedTCGeneration.pdfES2022-Minh-Nguyen-ShapingTestsIntoModelsForAutomatedTCGeneration.pdf
ES2022-Minh-Nguyen-ShapingTestsIntoModelsForAutomatedTCGeneration.pdfMinh Nguyen
 
Model-based programming and AI-assisted software development
Model-based programming and AI-assisted software developmentModel-based programming and AI-assisted software development
Model-based programming and AI-assisted software developmentEficode
 

Similar a Creating an in-house computerized adaptive testing (CAT) program with Concerto (20)

A framework and approaches to develop an in-house CAT with freeware and open ...
A framework and approaches to develop an in-house CAT with freeware and open ...A framework and approaches to develop an in-house CAT with freeware and open ...
A framework and approaches to develop an in-house CAT with freeware and open ...
 
Automated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design ProblemsAutomated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design Problems
 
ictir2016
ictir2016ictir2016
ictir2016
 
AIRS2016
AIRS2016AIRS2016
AIRS2016
 
MSPresentation_Spring2011
MSPresentation_Spring2011MSPresentation_Spring2011
MSPresentation_Spring2011
 
⭐⭐⭐⭐⭐ Finding a Dynamical Model of a Social Norm Physical Activity Intervention
⭐⭐⭐⭐⭐ Finding a Dynamical Model of a Social Norm Physical Activity Intervention⭐⭐⭐⭐⭐ Finding a Dynamical Model of a Social Norm Physical Activity Intervention
⭐⭐⭐⭐⭐ Finding a Dynamical Model of a Social Norm Physical Activity Intervention
 
Topic Set Size Design with Variance Estimates from Two-Way ANOVA
Topic Set Size Design with Variance Estimates from Two-Way ANOVATopic Set Size Design with Variance Estimates from Two-Way ANOVA
Topic Set Size Design with Variance Estimates from Two-Way ANOVA
 
Sbst2018 contest2018
Sbst2018 contest2018Sbst2018 contest2018
Sbst2018 contest2018
 
Pathogen phylogenetics using BEAST
Pathogen phylogenetics using BEASTPathogen phylogenetics using BEAST
Pathogen phylogenetics using BEAST
 
RS in the context of Big Data-v4
RS in the context of Big Data-v4RS in the context of Big Data-v4
RS in the context of Big Data-v4
 
Attribute MSA
Attribute MSAAttribute MSA
Attribute MSA
 
Attribute MSA
Attribute MSA Attribute MSA
Attribute MSA
 
Introduction to Bayesian phylogenetics and BEAST
Introduction to Bayesian phylogenetics and BEASTIntroduction to Bayesian phylogenetics and BEAST
Introduction to Bayesian phylogenetics and BEAST
 
Qualificacao acd
Qualificacao acdQualificacao acd
Qualificacao acd
 
Can AI Tell Emerging Technologies
Can AI Tell Emerging TechnologiesCan AI Tell Emerging Technologies
Can AI Tell Emerging Technologies
 
Fundamentals of Engineering Design
Fundamentals of Engineering DesignFundamentals of Engineering Design
Fundamentals of Engineering Design
 
ES2022-Minh-Nguyen-ShapingTestsIntoModelsForAutomatedTCGeneration.pdf
ES2022-Minh-Nguyen-ShapingTestsIntoModelsForAutomatedTCGeneration.pdfES2022-Minh-Nguyen-ShapingTestsIntoModelsForAutomatedTCGeneration.pdf
ES2022-Minh-Nguyen-ShapingTestsIntoModelsForAutomatedTCGeneration.pdf
 
Model-based programming and AI-assisted software development
Model-based programming and AI-assisted software developmentModel-based programming and AI-assisted software development
Model-based programming and AI-assisted software development
 
BIRTE-13-Kawashima
BIRTE-13-KawashimaBIRTE-13-Kawashima
BIRTE-13-Kawashima
 
When Should I Use Simulation?
When Should I Use Simulation?When Should I Use Simulation?
When Should I Use Simulation?
 

Más de Mizumoto Atsushi

2021-0509_JAECS2021_Spring
2021-0509_JAECS2021_Spring2021-0509_JAECS2021_Spring
2021-0509_JAECS2021_SpringMizumoto Atsushi
 
2015-1003 英語コーパス学会ワークショップ使用スライド
2015-1003 英語コーパス学会ワークショップ使用スライド2015-1003 英語コーパス学会ワークショップ使用スライド
2015-1003 英語コーパス学会ワークショップ使用スライドMizumoto Atsushi
 
LET2015 National Conference Seminar
LET2015 National Conference SeminarLET2015 National Conference Seminar
LET2015 National Conference SeminarMizumoto Atsushi
 
JSSS2014 Symposium (Atsushi Mizumoto)
JSSS2014 Symposium (Atsushi Mizumoto)JSSS2014 Symposium (Atsushi Mizumoto)
JSSS2014 Symposium (Atsushi Mizumoto)Mizumoto Atsushi
 
SappoRo.R #3 LT: Shiny by RStudio
SappoRo.R #3 LT: Shiny by RStudioSappoRo.R #3 LT: Shiny by RStudio
SappoRo.R #3 LT: Shiny by RStudioMizumoto Atsushi
 
量的データの分析・報告で気をつけたいこと
量的データの分析・報告で気をつけたいこと量的データの分析・報告で気をつけたいこと
量的データの分析・報告で気をつけたいことMizumoto Atsushi
 
2013全国英語教育学会WS公開用スライド
2013全国英語教育学会WS公開用スライド2013全国英語教育学会WS公開用スライド
2013全国英語教育学会WS公開用スライドMizumoto Atsushi
 
Rを使ったコンピュータ適応型テスト構築の試み
Rを使ったコンピュータ適応型テスト構築の試みRを使ったコンピュータ適応型テスト構築の試み
Rを使ったコンピュータ適応型テスト構築の試みMizumoto Atsushi
 
Let中部2012シンポスライド
Let中部2012シンポスライドLet中部2012シンポスライド
Let中部2012シンポスライドMizumoto Atsushi
 
Excelを使った統計解析とグラフ化入門
Excelを使った統計解析とグラフ化入門Excelを使った統計解析とグラフ化入門
Excelを使った統計解析とグラフ化入門Mizumoto Atsushi
 
2012-1110「マルチレベルモデルのはなし」(censored)
2012-1110「マルチレベルモデルのはなし」(censored)2012-1110「マルチレベルモデルのはなし」(censored)
2012-1110「マルチレベルモデルのはなし」(censored)Mizumoto Atsushi
 

Más de Mizumoto Atsushi (12)

2021-0509_JAECS2021_Spring
2021-0509_JAECS2021_Spring2021-0509_JAECS2021_Spring
2021-0509_JAECS2021_Spring
 
2015-1003 英語コーパス学会ワークショップ使用スライド
2015-1003 英語コーパス学会ワークショップ使用スライド2015-1003 英語コーパス学会ワークショップ使用スライド
2015-1003 英語コーパス学会ワークショップ使用スライド
 
LET2015 National Conference Seminar
LET2015 National Conference SeminarLET2015 National Conference Seminar
LET2015 National Conference Seminar
 
JSSS2014 Symposium (Atsushi Mizumoto)
JSSS2014 Symposium (Atsushi Mizumoto)JSSS2014 Symposium (Atsushi Mizumoto)
JSSS2014 Symposium (Atsushi Mizumoto)
 
SappoRo.R #3 LT: Shiny by RStudio
SappoRo.R #3 LT: Shiny by RStudioSappoRo.R #3 LT: Shiny by RStudio
SappoRo.R #3 LT: Shiny by RStudio
 
量的データの分析・報告で気をつけたいこと
量的データの分析・報告で気をつけたいこと量的データの分析・報告で気をつけたいこと
量的データの分析・報告で気をつけたいこと
 
2013 11 jacet-kansai-ws
2013 11 jacet-kansai-ws2013 11 jacet-kansai-ws
2013 11 jacet-kansai-ws
 
2013全国英語教育学会WS公開用スライド
2013全国英語教育学会WS公開用スライド2013全国英語教育学会WS公開用スライド
2013全国英語教育学会WS公開用スライド
 
Rを使ったコンピュータ適応型テスト構築の試み
Rを使ったコンピュータ適応型テスト構築の試みRを使ったコンピュータ適応型テスト構築の試み
Rを使ったコンピュータ適応型テスト構築の試み
 
Let中部2012シンポスライド
Let中部2012シンポスライドLet中部2012シンポスライド
Let中部2012シンポスライド
 
Excelを使った統計解析とグラフ化入門
Excelを使った統計解析とグラフ化入門Excelを使った統計解析とグラフ化入門
Excelを使った統計解析とグラフ化入門
 
2012-1110「マルチレベルモデルのはなし」(censored)
2012-1110「マルチレベルモデルのはなし」(censored)2012-1110「マルチレベルモデルのはなし」(censored)
2012-1110「マルチレベルモデルのはなし」(censored)
 

Último

Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...Nguyen Thanh Tu Collection
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Pooja Bhuva
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxAmanpreet Kaur
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxPooja Bhuva
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibitjbellavia9
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsKarakKing
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - Englishneillewis46
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 

Último (20)

Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
Sociology 101 Demonstration of Learning Exhibit

Creating an in-house computerized adaptive testing (CAT) program with Concerto

  • 1. Creating an in-house computerized adaptive testing (CAT) program with Concerto. Atsushi Mizumoto (Kansai University). 2013/09/20, JLTA at Waseda University
  • 4. CTT vs. IRT (Ohtomo, 2009)
      Aspect | CTT | IRT
      Test score | Ordinal scale | Interval scale
      Ability estimate | Test-dependent | Test-independent
      Test result | Person-dependent | Person-independent
      Measurement target (precision) | All test-takers | Individuals
      Equating/CAT | Difficult | Easy
  • 6. History of CAT Research: 40 years (Thompson & Weiss, 2011); 30 years in language testing (LT) (Koyama, 2010)
  • 11. Advantages of CAT (www.geocities.jp/kosugitti/labo/irtnote.pdf)
      • Tailored for individual test-takers
      • Shorter test time
      • More precision (= smaller SE)
      • No need for random sampling
  • 12. Purposes
      • Creating a CAT program
      • Evaluation
  • 13. Creating a CAT Program
      • Choosing the CAT System
      • Constructing an Item Bank (Pretest)
      • Calibrating the Item Bank
      • Determining Specifications & Feedback
      • Administering the CAT
  • 19. 1. Free account (150 test takers/month) 2. Amazon Machine Images (free for a year) 3. Installing it on your own server
  • 20.
      • Open-source
      • Running R on a server (catR, RMySQL)
      • HTML-based
  • 21. Installation on a server https://code.google.com/p/concerto-platform/wiki/installation4
  • 25. Constructing an Item Bank (Pretest)
      • Vocabulary Test (Mizumoto, 2006) http://www.mizumot.com/files/VocSizeMeasure.pdf
      • Based on SVL 12,000 (up to the 8,000 level; 30 items for each level)
      • 716 university EFL learners
  • 26. Sample Question (1): 心の, 精神の ("of the mind, mental"). A. essential B. creative C. loose D. mental
  • 29. Calibrating the Item Bank
      • 240 items analyzed (Rasch model)
      • 150 items left for the item bank
      • Calibrated with the two-parameter logistic (2PL) model (item difficulty & discrimination)
      • Upload the CSV file to Concerto
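To make the calibration step concrete, here is a minimal sketch of what the two parameters of a 2PL item do. It uses the standard logistic form of the model. The three item parameter pairs are hypothetical values chosen for illustration, not values from the slides' item bank.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response at ability theta.

    a = item discrimination, b = item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information that one 2PL item contributes at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical calibrated items as (a, b) pairs; not the actual item bank.
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 1.0)]

theta = 1.0
test_information = sum(item_information(theta, a, b) for a, b in items)

# The standard error of the ability estimate shrinks as information accumulates.
standard_error = 1.0 / math.sqrt(test_information)
print(round(standard_error, 2))
```

An item is most informative when its difficulty is close to the test-taker's ability, which is why a CAT picks items adaptively.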
  • 32. Specifications of CAT
      • Starting point (parameters, initial ability, randomized/fixed)
      • Ability estimation method (empirical Bayes and others)
      • Stopping rule (number of items/standard error)
      • Final ability estimation
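The four specifications above map onto the parts of a CAT loop. The sketch below is illustrative only and is not the Concerto or catR implementation. It assumes a fixed starting ability, maximum-information item selection, EAP estimation on a grid with a standard normal prior, and a stopping rule of either an item count or an SE target. All function and parameter names are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

# Ability grid from -4 to 4 in steps of 0.1 (an assumption of this sketch).
GRID = [-4.0 + 0.1 * k for k in range(81)]

def eap(responses):
    """EAP ability estimate and posterior SD under an N(0, 1) prior.

    responses: list of ((a, b), u) with u = 1 (correct) or 0 (incorrect).
    """
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized standard normal prior
        for (a, b), u in responses:
            p = p_correct(t, a, b)
            w *= p if u == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answer, start_theta=0.0, max_items=30, se_target=0.3):
    """Administer items until max_items is reached or SE <= se_target.

    bank: list of (a, b) pairs; answer: callable returning 1 or 0 for an item.
    Returns (final ability estimate, its SE, number of items used).
    """
    theta, se = start_theta, float("inf")
    remaining = list(bank)
    responses = []
    while remaining and len(responses) < max_items and se > se_target:
        # Select the unused item that is most informative at the current estimate.
        item = max(remaining, key=lambda ab: item_information(theta, *ab))
        remaining.remove(item)
        responses.append((item, answer(item)))
        theta, se = eap(responses)
    return theta, se, len(responses)
```

For example, `run_cat(bank, answer=lambda item: 1, max_items=5, se_target=0.0)` administers exactly five items to a test-taker who answers everything correctly.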
  • 33. Magis and Raîche (2012, p. 7)
  • 34. How many items for what SE?
      • Simulation with the catR package: Magis, D., & Raîche, G. (2012). http://www.jstatsoft.org/v48/i08
  • 35. True theta = 1, SE = 0.3; stopping rule = 30 items
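The question "how many items for what SE?" can also be approximated without a full simulation. The sketch below is not the catR simulation from the slides. It assumes an ideal CAT that always administers the items most informative at the true ability, and it uses a randomly generated, hypothetical 150-item 2PL bank, so its numbers are a best-case lower bound rather than a reproduction of the reported results.

```python
import math
import random

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def items_needed(bank, theta, se_target):
    """Smallest number of items whose summed information gives SE <= se_target.

    Assumes the most informative items at theta are given first.
    Returns None if the whole bank cannot reach the target.
    """
    infos = sorted((item_information(theta, a, b) for a, b in bank), reverse=True)
    total = 0.0
    for n, info in enumerate(infos, start=1):
        total += info
        if 1.0 / math.sqrt(total) <= se_target:
            return n
    return None

# Hypothetical bank: 150 items with made-up discrimination and difficulty ranges.
rng = random.Random(2013)
bank = [(rng.uniform(0.8, 2.0), rng.uniform(-2.0, 4.0)) for _ in range(150)]

for target in (0.5, 0.4, 0.3):
    print(target, items_needed(bank, theta=1.0, se_target=target))
```

A real administration needs more items than this bound suggests, because early items are chosen at a provisional ability estimate rather than at the true theta.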
  • 42. 268 test takers (first-year university students)
      (1) CAT
      (2) Paper-pencil version (68 items), common-person linking
      (3) Questionnaire: “What did you think of the CAT result?”
  • 44. [Figure: CAT theta (random 30Qs) plotted against paper-pencil theta (fixed 68Qs); n = 268; value shown in the plot: 0.92]
  • 45. n = 268. CAT (30Qs): M = 1.71, SD = 1.13. P-P (68Qs): M = 1.72, SD = 0.95
  • 46. n = 268. CAT (30Qs): M = 1.71, SD = 1.13. P-P (68Qs): M = 1.72, SD = 0.95. Mean diff. = -0.02, 95% CI [-0.07, 0.04], d = 0.01, power = .06
  • 47. n = 268. CAT SE (30Qs): M = 0.39, SD = 0.11. P-P SE (68Qs): M = 1.71, SD = 1.13
  • 48. n = 268. CAT SE (30Qs): M = 0.39, SD = 0.11. P-P SE (68Qs): M = 1.71, SD = 1.13. Mean diff. of SE = -1.32, 95% CI [-1.44, -1.19], d = 1.65, power = 0.99
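The effect sizes on these slides can be checked from the reported means and SDs. The sketch below assumes Cohen's d was computed with the root-mean-square of the two SDs as the standardizer. The slides do not state the formula, and the data are paired, so the author may have used a different variant. With the rounded summary statistics it gives magnitudes close to the reported d values of 0.01 and 1.65.

```python
import math

def cohens_d(m1, sd1, m2, sd2):
    """Cohen's d with the root-mean-square of the two SDs as standardizer.

    This formula is an assumption; the slides do not state which one was used.
    """
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2.0)
    return (m1 - m2) / pooled_sd

# Summary statistics as reported on the slides (CAT first, paper-pencil second).
d_theta = cohens_d(1.71, 1.13, 1.72, 0.95)  # ability estimates
d_se = cohens_d(0.39, 0.11, 1.71, 1.13)     # standard errors

print(abs(d_theta), abs(d_se))
```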
  • 49. Evaluation: CAT vs. Paper-pencil
      • Means: CAT = Paper-pencil
      • SEs: CAT < Paper-pencil
      • CAT measures the same ability with much more precision (with fewer items).
  • 51. Result of the Questionnaire [Figure: frequency of each response: Very inaccurate, Inaccurate, Rather inaccurate, Rather accurate, Accurate, Very accurate]
  • 53. Future Research
      • More items in the item bank
      • Better formula for predicting other test scores
      • Improved feedback
      • Collaboration
  • 54. Summary
      • Created a CAT program
      • Evaluation: (1) CAT better than paper-pencil; (2) feedback needs improvement.