Analysis of the Generalization of Students’
Success Predictive Models in a Series of Java
MOOCs on edX
Pedro Manuel Moreno Marcos
Pedro Manuel Moreno-Marcos, Miguel Rodríguez Guillén,
Carlos Alario-Hoyos, Pedro J. Muñoz-Merino, Iria Estévez-Ayres and
Carlos Delgado Kloos
Ninth European MOOCs Stakeholders Summit 2025 (eMOOCs 2025)
Palaiseau, France, June 30th 2025
INDEX
1. Introduction
2. Methodology
3. Results
4. Conclusions
INTRODUCTION
High dropout rates in MOOCs → Predictive models
Limitations of the models due to generalizability
Several approaches: transfer models, in-situ models, global models
OBJECTIVES
• Analyze to what extent predictive models for forecasting
success and grades can generalize to:
O1. other MOOCs on the same topic
O2. the same MOOCs in a different language
O3. the same MOOCs in a different instruction mode
(teacher paced / learner paced)
EDUCATIONAL CONTEXT
• Three MOOCs about Java programming on edX
• Similar grading criteria
• Different languages and instruction modes
VARIABLES
Dependent variables: final grade and pass/fail
Independent variables
• User level: avg. formative exercises, % formative
exercises, no. participations in the forum, etc.
• Course level: version, no. exercises, no. videos, duration,
language, instruction mode, average grade
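One way to picture how user-level and course-level variables end up in a single model input is sketched below. This is only an illustration: the class names, field names, units, category labels and the 0/1 encoding are assumptions and are not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudentActivity:
    """User-level variables (hypothetical field names)."""
    avg_formative_score: float      # average score on formative exercises, 0..1
    pct_formative_attempted: float  # share of formative exercises attempted, 0..1
    forum_participations: int       # number of forum participations


@dataclass(frozen=True)
class CourseInfo:
    """Course-level variables (hypothetical field names)."""
    version: int
    n_exercises: int
    n_videos: int
    duration_weeks: int    # the unit is an assumption
    language: str          # "en" or "es" (assumed codes)
    instruction_mode: str  # "teacher_paced" or "learner_paced" (assumed labels)
    average_grade: float


def to_feature_row(student, course):
    """Flatten both levels into one numeric row; categories become 0/1 flags."""
    return [
        student.avg_formative_score,
        student.pct_formative_attempted,
        float(student.forum_participations),
        float(course.version),
        float(course.n_exercises),
        float(course.n_videos),
        float(course.duration_weeks),
        1.0 if course.language == "en" else 0.0,
        1.0 if course.instruction_mode == "teacher_paced" else 0.0,
        course.average_grade,
    ]


# Made-up values, for illustration only.
example_row = to_feature_row(
    StudentActivity(0.75, 0.5, 3),
    CourseInfo(2, 40, 30, 5, "es", "learner_paced", 0.6),
)
```

Every student in the same course run shares the same course-level values, so those columns only carry information once data from several courses is pooled.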
ANALYTICAL METHODS
ALGORITHMS
• Decision Trees (DT)
• Random Forest (RF)
• k-Nearest Neighbors (kNN)
• Gradient Boosting (GB)
METRICS
• Final grade: Root Mean Square Error (RMSE)
• Pass/Fail: Area Under the Curve (AUC)
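For readers less familiar with the two metrics, the sketch below computes them in plain Python. It assumes grades are plain numbers and pass/fail is encoded as 1/0. The AUC is computed by pairwise comparison of passing and failing students, which is one standard formulation and not necessarily the implementation used in the study.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted grades."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    This is the fraction of (pass, fail) student pairs in which the passing
    student received the higher predicted score; ties count as half a pair.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one pass and one fail")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Lower RMSE is better for the final-grade models, while higher AUC is better for the pass/fail models; an AUC of 0.5 corresponds to random ranking.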
O1. TRANSFER MODEL BETWEEN
COURSES
• Stable results, except when training with C3
• In some cases, the transferred model is slightly better
• Combined models are slightly better
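The comparison between transferred and combined models can be sketched as follows. This is a toy illustration under stated assumptions: the courses "A", "B" and "C", their feature values and their grades are made up, a simple kNN regressor stands in for the algorithms of the study, and "combined" is assumed to mean pooling the training data of several courses before predicting a held-out course.

```python
from math import sqrt

# Made-up toy data: course -> (feature rows, final grades in 0..1).
# The two feature columns are hypothetical user-level variables.
COURSES = {
    "A": ([[0.9, 0.8], [0.2, 0.1], [0.6, 0.5], [0.4, 0.3]], [0.9, 0.2, 0.6, 0.4]),
    "B": ([[0.8, 0.9], [0.1, 0.2], [0.5, 0.5]], [0.8, 0.1, 0.5]),
    "C": ([[0.7, 0.6], [0.3, 0.1]], [0.7, 0.3]),
}


def knn_predict(train_X, train_y, x, k=3):
    """Predict a grade as the mean grade of the k nearest training students."""
    by_distance = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted grades."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def evaluate(train, test, k=3):
    """Train on one dataset, predict another, and return the RMSE."""
    train_X, train_y = train
    test_X, test_y = test
    preds = [knn_predict(train_X, train_y, x, k) for x in test_X]
    return rmse(test_y, preds)


def combine(datasets):
    """Pool several courses into a single training set."""
    X, y = [], []
    for rows, grades in datasets:
        X.extend(rows)
        y.extend(grades)
    return X, y


if __name__ == "__main__":
    target = "C"
    others = [name for name in COURSES if name != target]
    # Transferred models: train on one other course, predict the target.
    for source in others:
        score = evaluate(COURSES[source], COURSES[target])
        print(f"transfer {source} -> {target}: RMSE = {score:.3f}")
    # Combined model: train on all other courses together.
    pooled = combine([COURSES[name] for name in others])
    print(f"combined {'+'.join(others)} -> {target}: "
          f"RMSE = {evaluate(pooled, COURSES[target]):.3f}")
```

The key point of the design is that the target course never contributes training rows, so the reported error reflects generalization to an unseen course and not fit to the same data.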
O2. TRANSFER MODELS WITH COURSES
IN DIFFERENT LANGUAGES
• Models computed with GB
• Use of English to train and predict
• Small differences when predicting Spanish courses
• No improvement when training with both courses
O3. TRANSFER MODELS WITH COURSES
WITH DIFFERENT INSTRUCTION MODES
• Only small differences are observed
• Results are slightly better for learner paced when
training with data from teacher paced courses → not
significant
CONCLUSIONS
In general, it is possible to achieve a good level of generalizability
Possible effect of the course context and sample size
Combination of courses might be beneficial
Slight drop when changing the language (higher in English)
Small differences depending on the instruction mode
LIMITATIONS AND FUTURE WORK
Limitations
Only one series of MOOCs was used
Sample size might affect the results
Only students who interacted with exercises were considered
Future work
Analyze a wider variety of MOOCs
Analyze the effect of specific variables on generalizability
Add new variables
Carry out experiments in live courses
ACKNOWLEDGEMENTS
• Universidad Carlos III de Madrid (UC3M) through the
Grants for the Research Activity of Young Doctors of
the UC3M’s Own Research and Transfer Program
(ASESOR-IA project)
• GENIE Learn project - Grant PID2023-146692OB-C31,
funded by MICIU/AEI/10.13039/501100011033
and by ERDF/EU. Website:
https://genielearn.uc3m.es/
• UNESCO Chair of “Scalable Digital Education for All”
at UC3M
• Grant RED2022-134284-T (SNOLA project) funded by
MICIU/AEI/10.13039/501100011033.
