Theoretical overview and two examples of Data Archeology - a need to deeply understand context and engage in ground-truthing when analyzing large sets of digital data.
Data Archeology - A theory- and context-informed approach to analyzing data traces
1. DATA ARCHEOLOGY
Image Credit: Pedro Szekely via Flickr (CC BY 2.0), adapted
A THEORY-INFORMED APPROACH TO ANALYZING DATA
TRACES OF SOCIAL INTERACTION IN LARGE SCALE
LEARNING ENVIRONMENTS
ALYSSA WISE
SIMON FRASER UNIVERSITY
2. OVERVIEW
DATA ARCHEOLOGY – THE BIG IDEA
APPLICATION
THE ROLE OF “LISTENING” IN ONLINE DISCUSSIONS
SOCIAL INTERACTION IN LARGE SCALE LEARNING ENVIRONMENTS
CONCLUSION
9. THEORETICALLY-INFORMED
MOVING BEYOND “MORE IS BETTER” AS A LEARNING MODEL TO
PROBE WHAT KINDS OF THINGS ARE BETTER FOR WHAT
PURPOSES AND WHY
10. LEARNING “CIVILIZATION”
ATTENDING TO THE PEDAGOGICAL CONTEXT AS A
CRITICAL FRAME FOR INTERPRETING THE PAST
ACTIVITY THAT OCCURRED
11. TEMPORALITY & TRAJECTORIES
THE COMPLETE AGGREGATED DATA RECORD AVAILABLE AT
THE END DOESN’T REFLECT THE DYNAMIC ENVIRONMENT
IN WHICH THE ACTIVITY OCCURRED
12. (RE)FRAMING QUESTIONS
FROM
WHAT TOOLS ARE THERE IN THE ONLINE ENVIRONMENT?
HOW MUCH DO STUDENTS USE THEM?
TO
WHAT IS THE PURPOSE OF THE LEARNING ACTIVITIES CONDUCTED IN THE TOOLS?
WHAT ARE THEORETICALLY DESIRABLE PATTERNS OF PARTICIPATION?
HOW CAN THESE BEST BE PROXIED BY THE AVAILABLE DATA?
FOR MORE ON CONNECTING LEARNING ANALYTICS + LEARNING
DESIGN SEE LOCKYER, HEATHCOTE & DAWSON [2013]
14. AN ONLINE DISCUSSION FORUM IS A TOOL
ITS EDUCATIONAL PURPOSE CAN CHANGE
• Q&A
• Peer Review
• Dialogue
• Reading Response
• Team Decision Making
• Argumentation
15. DIFFERENT PURPOSES
FOR A DISCUSSION FORUM
IMPLY DIFFERENT
EXPECTATIONS FOR
DESIRED PATTERNS OF USE
17. ONLINE DISCUSSION
LEARNING PURPOSE
Externalizing one’s ideas by contributing posts to an online discussion
Taking in the externalizations of others by accessing existing posts
• Social constructivist perspective - online discussions as a forum
for learning through dialogue
• Learning occurs as students articulate their ideas, are exposed to
the ideas of others, and negotiate differences in perspective
• Focus on how students contribute comments (“speak”),
attend to others’ messages (“listen”), and the connections between them
18. UNDERLYING THEORY OF
ONLINE “LISTENING”
Listening not Lurking
Lurker
• Specific person who participates
passively
• Accesses existing comments but
does not contribute
• Negative connotation
Listening
• Active process conducted by anyone in
online discussion
• Activity interrelated with contributing.
• Productive element of discussion
participation
Listening not Reading
Listening
• Specific term (online discussions)
• Dynamic text, distinct sub-units
• Multi-authored
• Generating a response often involved
Reading
• Generic term (all written text)
• Static, cohesive text
• Single author
• Does not require response
19. ONLINE DISCUSSION LEARNING MODEL
Speaking
Mechanism for sharing ideas
Value in speaking that is
• Relevant to the topic at hand
• Rationaled with evidence
• Recurring and distributed
• Moderately portioned
• Responsive to the conversation
Listening
Mechanism for becoming aware of ideas
Value in listening that is
• Broad (to consider a diversity of ideas)
• Deep (to consider ideas in earnest)
• Recursive (to provide context for discussion flow)
• Integrated (attending to connected rather than scattered comments)
20. ONLINE DISCUSSION
PEDAGOGICAL CONTEXT
• Group and Timing
– Small group discussions (~8-12 students)
– Random assignment (would be better with differing perspectives)
– Discussions run on a weekly schedule with course
• Task
– Contested real-world challenges (business, edu psychology)
– Given two viable contrasting perspectives, come to consensus
– Share decision with rationale with whole class
• Expectations
– Given criteria / guidelines for speaking and listening
– Assessment varies (individual/group, student/instructor driven)
23. ONLINE DISCUSSION LEARNING MODEL
Listening
Mechanism for becoming aware of ideas
Value in listening that is
• Broad (to consider a diversity of ideas)
• Deep (to consider ideas in earnest)
• Recursive (to provide context for discussion flow)
• Integrated (attending to connected rather than scattered comments)
24. ONLINE DISCUSSION LISTENING ANALYTICS
Criteria | Metric | Definition
Breadth | % posts viewed | Number of unique posts that a student viewed divided by the total number of posts in the discussion
Breadth | % posts read | Number of unique posts that a student read divided by the total number of posts in the discussion
Depth | % (real) reads | Number of times a student viewed others’ posts at < 6.5 words per second (wps), divided by the total number of views
Depth | Av length of real reads (min) | Total time spent reading posts, divided by the number of reads (after scans removed)
Recursiveness | # of reviews of others’ posts | Number of times a student revisited posts that they had viewed previously in the discussion
Integration | Posts read connected, not scattered | Concentration of posts viewed by a student in the discussion space* [thread-density, network metrics..]
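A minimal sketch of how the breadth, depth and recursiveness metrics might be computed for one student from a log of view events. Only the metric definitions and the 6.5 words-per-second cutoff come from the slide; the `View` record, its field names and the sample values are hypothetical. It assumes the log is in time order, contains only views of others' posts, and records the word count of each post and the seconds spent on it. The integration metric is left out because the slide does not define it precisely.

```python
from dataclasses import dataclass

READ_SPEED_CUTOFF_WPS = 6.5  # from the slide: slower than this counts as a "real" read


@dataclass
class View:
    post_id: str    # hypothetical field names
    words: int      # length of the viewed post
    seconds: float  # time spent on the post


def is_real_read(view: View) -> bool:
    """A view is a read (not a scan) if the implied speed is under the cutoff."""
    return view.seconds > 0 and view.words / view.seconds < READ_SPEED_CUTOFF_WPS


def listening_metrics(views: list[View], total_posts: int) -> dict[str, float]:
    """Listening metrics for one student; `views` must be in time order."""
    reads = [v for v in views if is_real_read(v)]
    seen: set[str] = set()
    reviews = 0
    for v in views:
        if v.post_id in seen:
            reviews += 1  # revisit of a post viewed earlier
        seen.add(v.post_id)
    return {
        "pct_posts_viewed": len(seen) / total_posts,
        "pct_posts_read": len({v.post_id for v in reads}) / total_posts,
        "pct_real_reads": len(reads) / len(views) if views else 0.0,
        "avg_read_minutes": sum(v.seconds for v in reads) / len(reads) / 60 if reads else 0.0,
        "num_reviews": reviews,
    }


# Hypothetical log: two real reads of p1 (the second is a revisit) and one scan of p2
sample = [
    View("p1", words=130, seconds=60),
    View("p2", words=130, seconds=10),
    View("p1", words=130, seconds=30),
]
metrics = listening_metrics(sample, total_posts=4)
```

Keeping scans in the denominator of `pct_real_reads` but out of `avg_read_minutes` follows the two Depth definitions in the table.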
25. ONLINE DISCUSSION LEARNING MODEL
Speaking
Mechanism for sharing ideas
Value in speaking that is
• Relevant to the topic at hand
• Rationaled with evidence
• Recurring and distributed
• Moderately portioned
• Responsive to the conversation
26. ONLINE DISCUSSION SPEAKING ANALYTICS
Criteria | Metric | Definition
Recurring | Number of posts | Total number of posts a student contributed to the discussion
Recurring | Percent of sessions with posts | Number of sessions in which a student made a post, divided by their total number of sessions
Moderately Portioned | Average post length | Total number of words posted by a student divided by the number of posts they made to the discussion
Responsive | Depth of response to existing conversation | 0 None; 1 Acknowledging; 2 Responding to an idea; 3 Responding to multiple ideas
Rationaled | Degree of argumentation | 0 No argumentation; 1 Unsupported argumentation (Position only); 2 Simple argumentation (Position + Reasoning/Evidence); 3 Complex argumentation (Position + Reasoning/Evidence + Qualifier/Rebuttal)
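The three count-based speaking metrics can be sketched in a few lines. This is an illustration under assumed inputs, not the authors' implementation: the function name, the input shapes and the sample values are hypothetical. The Responsive and Rationaled criteria are hand-coded on 0 to 3 scales, so they are not computed here.

```python
def speaking_metrics(post_word_counts: list[int],
                     sessions_with_post: list[bool]) -> dict[str, float]:
    """Count-based speaking metrics for one student.

    post_word_counts: number of words in each post the student made.
    sessions_with_post: one flag per session, True if a post was made in it.
    """
    n_posts = len(post_word_counts)
    n_sessions = len(sessions_with_post)
    return {
        "num_posts": n_posts,
        "pct_sessions_with_posts": sum(sessions_with_post) / n_sessions if n_sessions else 0.0,
        "avg_post_length": sum(post_word_counts) / n_posts if n_posts else 0.0,
    }


# Hypothetical student: three posts spread over six sessions
example = speaking_metrics([120, 80, 100], [True, False, True, False, True, False])
```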
27. SOME RESULTS
[WISE, HSIAO ET AL. 2012]
[WISE, PERERA ET AL., 2012]
[WISE, SPEER ET AL. 2013]
Depth (rows) by Breadth, % of posts viewed (columns) | Low breadth | High breadth
Low depth | Disregardful | Coverage
High depth | Focused | Thorough
Un-engaged Engaged
29. SOME RESULTS
Depth, % of real reads (rows) by Breadth, % of posts viewed (columns) | Low breadth | High breadth
Low depth | Disregardful | Coverage
High depth | Focused | Thorough
Un-engaged Engaged
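The two-by-two typology can be read as a simple lookup on a student's breadth and depth scores. The sketch below is only an illustration of that reading: the slides do not say how "low" and "high" were separated, so the 0.5 cutoffs and the function name are placeholders.

```python
def listening_pattern(breadth: float, depth: float,
                      breadth_cutoff: float = 0.5,  # placeholder, not from the slides
                      depth_cutoff: float = 0.5) -> str:
    """Map breadth (% posts viewed) and depth (% real reads) to a pattern label."""
    high_breadth = breadth >= breadth_cutoff
    high_depth = depth >= depth_cutoff
    if high_breadth and high_depth:
        return "Thorough"
    if high_breadth:
        return "Coverage"
    if high_depth:
        return "Focused"
    return "Disregardful"


labels = [listening_pattern(b, d) for b, d in [(0.2, 0.1), (0.9, 0.2), (0.3, 0.8), (0.9, 0.9)]]
```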
30. SOME MORE RESULTS
[WISE, HAUSKNECHT & ZHAO, 2014]
Greater
Listening Depth
(% of real reads)
Listening Recursiveness
(# reviews of others’ posts)
Associated with
More Rationaled Speaking
More Responsive Speaking
Listening Breadth not associated with any speaking qualities
in the study. Less important for current pedagogical design?
31. FLESHING OUT TYPOLOGIES
Pattern | Characteristic Behaviors
Disregardful | Minimal attention to others’ posts (few posts viewed; short time viewing). Brief and relatively infrequent sessions of activity in discussions.
Coverage | Views a large proportion of others’ posts, but spends little time attending to them (often only scanning the contents). Short but frequent sessions of activity, focusing primarily on new posts. *May be socially-oriented or content-driven.
Focused | Views a limited number of others’ posts, but spends substantial time attending to them. Few extended sessions of activity in discussions.
Thorough | Views a large proportion of others’ posts; spends substantial time attending to many of them. Long overall time spent listening; considerable revisitation of posts already read.
33. TAKEAWAY
ATTENDING TO THE PEDAGOGICAL
CONTEXT OF DISCUSSION FORUM
USE AND CRAFTING THEORETICALLY
INFORMED METRICS LET US
EXTRACT EXPLANATORY AND
ACTIONABLE INFORMATION FROM
THE CLICKSTREAM DATA
35. CHALLENGES WE SET OURSELVES
FOR LOOKING AT THE MOOC DATA
LOOK AT SOCIAL INTERACTION
ADDRESS ISSUES OF SCALE
EMPLOY NATURAL LANGUAGE PROCESSING*
ATTEND TO PEDAGOGICAL CONTEXT
WORK IN A THEORY-INFORMED WAY
36. SOCIAL INTERACTION IN MOOC FORUMS
STRONG PREDICTOR OF PERSISTENCE BUT THIS MAY BE
BECAUSE IT INDEXES (RATHER THAN CAUSES)
ENGAGEMENT – WHAT ABOUT LEARNING?
CLAIMED TO PROVIDE CRITICAL SOCIAL LEARNING
SUPPORT BUT WITHOUT TIES TO THE ACADEMIC
CONTENT, SOCIABILITY MAY NOT IMPACT LEARNING
[KUH, 2002; WISE, DEL VALLE, CHANG & DUFFY, 2004]
ONLY A SMALL % PARTICIPATE BUT THIS IS NOT
SURPRISING IF IT IS NOT DESIGNED INTO A COURSE.
HOW PEOPLE PARTICIPATE IS AS IMPORTANT AS IF THEY
DO SO.
37. WHY FOCUS ON LEARNING NOT ATTRITION?
STRONG NEED TO RECONCEPTUALISE PERSISTENCE AND
ATTRITION IN MOOCS GIVEN THE NUMBER OF PEOPLE WHO
REGISTER W/O “AN INFORMED COMMITMENT TO COMPLETE
THE COURSE”
[DEBOER, HO, STUMP & BRESLOW, 2014]
GREAT VARIETY IN INTENTIONS, WORKING PATTERNS,
RESOURCES USED, SEQUENCE AND FREQUENCY OF USE
[DEBOER ET AL., 2014; KIZILCEC, PIECH & SCHNEIDER, 2013]
JUST INDEXING LEVEL OF ENGAGEMENT TO PREDICT WHO
WILL STOP PARTICIPATING DOESN’T TELL US WHY OR HOW
TO INTERVENE
LEARNING MAY BE OCCURRING EVEN FOR THOSE WHO DON’T
EVENTUALLY COMPLETE
38. WHY FOCUS ON LEARNING NOT ATTRITION?
“I'm very happy to be in this course. I [couldn’t] finish it on
time, but I think I have learnt a lot. Thank you Prof X, you
are a great teacher, very [professional], excellent in many
ways. I will miss you!”
40. FRAMING QUESTIONS
WHAT WAS THE PEDAGOGICAL PURPOSE / DESIGN OF
THE DISCUSSION FORUMS IN THE PSYCH MOOC ?
BASED ON THIS, WHAT WERE THEORETICALLY
DESIRABLE PATTERNS OF PARTICIPATION?
HOW CAN THESE BEST BE PROXIED BY THE AVAILABLE
DATA?
HOW COULD THE DESIRED PATTERNS BE BETTER
SUPPORTED?
41. MOOC
PEDAGOGICAL CONTEXT
• Course Topic
– Introductory Psychology
• Level and Expected Background
– Designed for college freshmen
– Equivalent of high school education expected
– No specific prior knowledge indicated
• Course Design
– Video lectures (8-15 min long)
– Readings from OLI (Open Learning Initiative) online textbook
– Weekly timed multiple choice quiz
– Final exam at the end of the course
42. WHAT ABOUT THE
DISCUSSION FORUMS?
• Optional part of the course, main pedagogical design a Q&A
forum to ask and answer questions about course material
Communication
“There will be a Q&A forum where you can post your questions
about the course. Students will have the opportunity to "vote up"
questions they want answered, and the questions with the most
votes will be answered either in a forum post or a video.”
….
Expectations
“Participants are expected to seek help if needed from your fellow
students by using the forums”
43. RECREATED
STUDENT FORUM VIEW
Forums
Welcome to the course discussion forums.
Sub-forum Activity
General Discussion
Discuss general aspects of the course.
Q&A
Ask and answer questions about course material.
Assignments
Discuss details of the course assignments.
Technical Issues
Post any issues with, or questions about, technical aspects of the course
website (trouble with video playback, broken links, etc.).
OLI Textbook Questions
Post any issues with, or questions about, technical aspects of the OLI Textbook.
Student Bios
Introduce yourself and learn about other students.
44. RECREATED
STUDENT FORUM VIEW
Forums
Welcome to the course discussion forums.
Sub-forum | Description | Threads (Posts + Comments)
General Discussion | Discuss general aspects of the course. | 289 (1341+804)
Q&A | Ask and answer questions about course material. | 158 (525+204)
Assignments | Discuss details of the course assignments. | 147 (827+775)
Technical Issues | Post any issues with, or questions about, technical aspects of the course website (trouble with video playback, broken links, etc.). | 108 (318+79)
OLI Textbook Questions | Post any issues with, or questions about, technical aspects of the OLI Textbook. | 99 (347+106)
Student Bios | Introduce yourself and learn about other students. | 662 (1614+354)
Notes: [1] Counts only include non-deleted posts/threads
[2] Counts taken prior to data cleaning, may include duplicate or nonsense posts
45. PROCESS CHECK
DOES IT MAKE SENSE TO USE LISTENING AND SPEAKING
THEORY IN THIS CONTEXT ?
• PEDAGOGICAL CHALLENGE
– MOST OF THE DISCUSSION ISN’T CONTENT RELATED,
LISTENING ISN’T EXPECTED TO RELATE TO LEARNING
• TECHNICAL CHALLENGE
– LOW GRANULARITY DATA (“VIEW.FORUM” &
“VIEW.THREAD” VS. “VIEW.POST”, THOUGH “VOTE.UP”
NOW AVAILABLE)
• PRACTICAL CHALLENGE
– MANY THREADS IN Q&A FORUM NOT ACTUALLY CONTENT
46. CHANGING TRACKS
A WHOLE BUNCH OF QUESTIONS WE THOUGHT WE
WERE GOING TO ASK WENT OUT THE WINDOW…
• NEW (VERY BASIC) FOCUS ON IF THE PATTERNS OF
FORUM USE MATCHED THOSE DESIRED FOR THE
INTENDED PURPOSE
– DID STUDENTS USE THE Q&A FORUM TO ASK
QUESTIONS ABOUT THE COURSE MATERIAL?
– DID THE INSTRUCTORS REPLY TO (THE HIGHEST
VOTED) QUESTIONS ABOUT THE COURSE MATERIAL?
47. DID STUDENTS USE THE Q&A
FORUM TO ASK QUESTIONS
ABOUT THE COURSE MATERIAL?
• After preliminary inspection we decided to code both the Q&A and
General Discussion (GD) forums because no clear functional difference was seen
• Two raters coded the starting post in each thread as either
– Content [C] (Asking questions about course material, expanding on
course content; discussing a resource shared)
– Non-Content [X] (Including logistics, social, study group formation and
link sharing)
• 439 of 447 total threads coded
– 8 removed for foreign language or complete nonsense contents
– 92% agreement (kappa = 0.81), all differences reconciled, rule of leniency
Image: So Many MOOCs by mksmith23, CC by 2.0 license
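For readers unfamiliar with the agreement figures reported for the two raters, the following sketch shows how percent agreement and Cohen's kappa are computed from two parallel lists of codes. The ten sample codes are invented for illustration and are not the study's data; the function name is also an assumption.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for two equal-length label lists."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's marginal label frequencies
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return observed, 1.0
    return observed, (observed - expected) / (1 - expected)


# Invented codes for ten thread-starting posts (C = content, X = non-content)
codes_a = list("CCCXXXXXXX")
codes_b = list("CCXXXXXXXC")
agreement, kappa = cohens_kappa(codes_a, codes_b)
```

Kappa is lower than raw agreement because it discounts the matches two raters would produce by chance given how often each uses each code.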
48. DID STUDENTS USE THE Q&A
FORUM TO ASK QUESTIONS
ABOUT THE COURSE MATERIAL?
Forum | Content Threads | Non-Content Threads
General Discussion | 55 | 226
Q&A | 68 | 90
Total | 123 (28%) | 316 (72%)
49. DID THE INSTRUCTORS REPLY TO
(THE HIGHEST VOTED) QUESTIONS
ABOUT THE COURSE MATERIAL?
• First approach: “Instructor Replied” label [problematic]
• 2 “official” Instructor IDs (threads automatically labelled)
– Course Professor [2XXXXX4]
– Course TA [5XXXX1]
• 1 “unofficial” Instructor ID (threads not automatically labelled)
– Course Professor [2XXXXX0]
“[Yes,] I really am NAME2XXXXX0 (XXXX is my first name) and
am the instructor for the course. I've been at UNIVERSITY for 43
years and love teaching. This course was a challenge because
there was no feedback from students when the modules were
being taped. The lack of student interaction is the real
challenge of a MOOC. Just looking at a camera is a very
different context than looking at a classroom of bright
students. NAME2XXXXX0, Instructor”
50. DID THE INSTRUCTORS REPLY TO
(THE HIGHEST VOTED) QUESTIONS
ABOUT THE COURSE MATERIAL?
Forum | Threads | Instructor Replied (All 3 IDs) | Content Threads | Replied by Instructor | % of Instructor Replies Directed at Content
General Discussion | 289 | 31 (11%) | 55 | 5 (9%) | 16%
Q&A | 158 | 31 (20%) | 68 | 17 (25%) | 55%
Total | 447 | 62 (14%) | 123 | 22 (18%) | 35%
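The percentages in this table follow from the raw counts, which the short check below recomputes. The counts are taken from the slide; the reading of the last column as content threads replied to divided by all instructor-replied threads is an inference from the arithmetic, and the variable names are illustrative.

```python
rows = {
    # forum: (threads, instructor-replied threads, content threads, content threads replied to)
    "General Discussion": (289, 31, 55, 5),
    "Q&A": (158, 31, 68, 17),
}
rows["Total"] = tuple(sum(col) for col in zip(*rows.values()))


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


summary = {
    forum: {
        "replied_pct": pct(replied, threads),                  # share of threads with an instructor reply
        "content_replied_pct": pct(content_replied, content),  # share of content threads replied to
        "replies_at_content_pct": pct(content_replied, replied),  # share of replies aimed at content
    }
    for forum, (threads, replied, content, content_replied) in rows.items()
}
```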
51. DID THE INSTRUCTORS REPLY TO
(THE HIGHEST VOTED) QUESTIONS
ABOUT THE COURSE MATERIAL?
Measure | Instructor Replied (62 threads) | Non-Replied (377 threads)
Average # (range) of votes | 2.6 (0 to 30) | 1.7 (-13 to 45)
Av # (range) of posts+comments | 8.6 (2-110) | 6.2 (1-92)
Av # (range) views | 110 (25-1185) | 75 (5-1143)
Measure | Content Threads | Non-Content Threads
Av # votes | 1.2 | 2.1
Av # posts+comments | 4.1 | 7.5
Av # views | 51 | 91
52. A CORE CHALLENGE FOR
SOCIAL INTERACTION AT SCALE
• Too much quantity, not enough quality
• Students get lost / overwhelmed in the abundance of
communication
• Instructors too, challenging to find where their input is
needed
• A need to separate “the wheat from the chaff”
53. CAN NATURAL LANGUAGE
PROCESSING HELP?
• Goal to support the instructor in finding content threads
more efficiently in the forums to be able to respond and
facilitate learning
• A modest attempt to build a proof-of-concept model
– Feature extraction performed with basic bag-of-words feature set
(inc. bigrams, trigrams and parts-of-speech tagging), rare
threshold of 5
– Unigrams and bigrams alone most useful to characterize and
model posts
– Total of 1573 features extracted
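A rough sketch of unigram and bigram bag-of-words extraction with a rarity cutoff, using only the Python standard library. This is not the tool used in the study. It assumes that the "rare threshold" means dropping n-grams that appear in fewer than that many posts, and the tokenizer, function names and sample posts are all hypothetical. Bigrams are joined with an underscore to match the feature names shown on the slides.

```python
import re
from collections import Counter


def ngrams(text: str, max_n: int = 2) -> list[str]:
    """Lowercased 1..max_n-grams, with multi-word grams joined by '_'."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    grams: list[str] = []
    for n in range(1, max_n + 1):
        grams += ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


def build_vocabulary(posts: list[str], rare_threshold: int = 5, max_n: int = 2) -> list[str]:
    """Keep n-grams that occur in at least `rare_threshold` posts (assumed meaning)."""
    doc_freq: Counter = Counter()
    for post in posts:
        doc_freq.update(set(ngrams(post, max_n)))
    return sorted(g for g, df in doc_freq.items() if df >= rare_threshold)


def vectorize(post: str, vocabulary: list[str], max_n: int = 2) -> list[int]:
    """Binary presence vector over the vocabulary."""
    present = set(ngrams(post, max_n))
    return [int(g in present) for g in vocabulary]


# Tiny invented corpus; a threshold of 2 is used so that anything survives
posts = ["Why is the answer correct?", "Why is memory limited?", "Thanks for the course"]
vocab = build_vocabulary(posts, rare_threshold=2)
vector = vectorize(posts[1], vocab)
```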
54. CHARACTERISTIC FEATURES
Content Threads (feature, kappa):
but 0.25; more 0.24; by 0.24; in_the 0.24; why 0.24; as 0.23; what 0.22; is 0.22; that 0.21; or 0.20; then 0.20; in 0.19; when 0.18; of_the 0.17; of 0.17; and_the 0.16; question 0.16; between 0.16; age 0.16; correct 0.16; than 0.16; were 0.15; by_the 0.15; an 0.15; answer 0.15; does 0.14; mental 0.14; research 0.14; to_the 0.14; their 0.14
Non-Content Threads (feature, kappa):
course 0.12; i 0.09; my 0.08; this_course 0.07; final 0.06; BOL_i 0.06; quiz 0.06; the_course 0.06; exam 0.05; thanks 0.05; i_am 0.05; videos 0.04; grade 0.04; certificate 0.04; BOL_hi 0.04; BOL_hello 0.04; hello 0.04; everyone 0.04; will 0.04; final_exam 0.04; i_just 0.04; hi 0.04; i_can 0.04; find 0.04; coursera 0.03; courses 0.03; i_have 0.03; grades 0.03; material 0.03; quizzes 0.03
55. PREDICTING CONTENT POSTS
• Procedure
– Algorithm: Support Vector Machines
– Setting for Nominal Class Values: LibLINEAR
– Cross-validation, 10 randomly generated folds
• Results
– Best Model Accuracy/Kappa = 0.86/0.64
– Recall = 0.71 (False Neg. rate = 0.29)
– Precision = 0.76
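To make the four reported figures concrete, the sketch below computes accuracy, kappa, recall and precision from a confusion matrix. The slides do not report the underlying counts: the values used here are inferred for illustration only, chosen to be consistent with 123 content and 316 non-content threads and with the rounded results above. The real cross-validated counts may differ.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Accuracy, Cohen's kappa, recall and precision for the positive (content) class."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # Chance agreement between predicted and actual labels, from the marginals
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1 - expected),
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }


# Inferred, not reported: 87 content threads found, 36 missed, 27 false alarms
inferred = binary_metrics(tp=87, fp=27, fn=36, tn=289)
rounded = {name: round(value, 2) for name, value in inferred.items()}
```

Under these assumed counts the model would flag 114 threads, of which 87 are content, while 36 content threads would go unflagged.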
56. STANDARD FORUM INDICATORS
DON’T HELP IDENTIFY
CONTENT
Model | Accuracy | Kappa | Recall | Precision
Base Model | 0.86 | 0.64 | 0.71 | 0.76
Addition of Standard Forum Indicators
Base Model + # votes | 0.84 | 0.60 | 0.68 | 0.74
Base Model + # posts | 0.85 | 0.62 | 0.69 | 0.75
Base Model + # views | 0.85 | 0.62 | 0.69 | 0.76
57. INSTRUCTOR PERSPECTIVE
Measure | W/o Content Model (Default) | With Content Model
Total Number of Potential Content Threads to Read | 439 (37/wk on av) | 114 (10/wk on av)
Percent of Threads Actually About Course Content | 28% | 76%
Percent of Content Threads With Instructor Replies | 18% | >18%?
Percent of Instructor Replies Addressing Content | 35% | >35%?
58. STRENGTHS, LIMITATIONS &
FUTURE OPPORTUNITIES
• Design of forums can improve, but unexpected use will still happen.
For instructors to facilitate learning, the first step is to locate where
learning opportunities are happening – content modeling can help.
– Aligns well with Coursera’s development of content / logistics TAs.
• Model is simple but useful; more sophisticated modelling can
improve these results.
• Model built with only 439 starting posts; including all the posts could
lead to both better prediction of whether a post is content-related and
more nuanced assessment of threads (e.g. “This thread is estimated
to have 87% content-related posts”)
• Model seemed not to draw heavily on domain-specific vocabulary
but may rely on domain-specific discourse types (extensibility to
other social sciences but perhaps not humanities / hard sciences)
59. TAKEAWAY
ATTENDING TO THE PEDAGOGICAL
CONTEXT OF DISCUSSION FORUM USE AND
GETTING CLOSE TO THE DATA LET US
DEVELOP A SIMPLE YET APPROPRIATE AND
USEFUL MODEL - SUPPORTING CONTENT
RELATED LEARNING DISCUSSION MAY BE
PRE-REQUISITE TO STUDYING MORE
COMPLEX FACETS OF INTERACTION IN LARGE
SCALE LEARNING ENVIRONMENTS
60. A DATA ARCHEOLOGY APPROACH
THAT PAYS ATTENTION TO THE
LEARNING “CIVILIZATION” THAT
CREATED THE DATA AND POSITS
THEORY-INFORMED PATTERNS OF
BEHAVIOR CAN HELP US BETTER
UNDERSTAND AND SUPPORT SOCIAL
INTERACTION IN LARGE SCALE
LEARNING ENVIRONMENTS
CONCLUSION