Sergey Sosnovsky
What’s in a textbook?
Architecture of an AES
[Figure: architecture diagram with the components Instructional Content, Interaction, User Model, Adaptation Model, Adaptation, Metadata, and Domain Model]
Math-Bridge: Rich Adaptive and Intelligent Textbooks
Sosnovsky, S., Dietrich, M., Andrès, E., Goguadze, G., Winterstein, S., Libbrecht, P., Siekmann, J., & Melis, E. (2014). Math-Bridge: Bridging the gaps in European remedial mathematics
with technology-enhanced learning. In T. Wassong, D. Frischemeier, P. R. Fischer, R. Hochmuth, & P. Bender (Eds.), Mit Werkzeugen Mathematik und Stochastik lernen – Using Tools for
Learning Mathematics and Statistics (pp. 437-451). Berlin/Heidelberg, Germany: Springer.
Intelligent Problem Solving Support
Personalized Course Generation
Adaptive Navigation
Metadata annotation
• error-prone
• time-consuming
• limited support of tools
• often many authors
• often non-expert authors
• difficult
• Math-Bridge metadata schema has more than 30 elements
• Math-Bridge content collection contains more than 10 000 learning objects
• About 50 people were involved in preparing this collection
The Burden of Authoring
• Learning content authoring has always been tedious, expertise-demanding, and poorly supported
• Content and knowledge authoring for adaptive intelligent systems requires a lot of extra effort
• Information and knowledge already existing in the system should become not the authoring burden but the vehicle for authoring support
[Figure: three panels. Authoring for e-Learning (Instructional Content); Authoring for Adaptive e-Learning (Instructional Content, Metadata); Authoring for Adaptive e-Learning as It Should Be (Instructional Content)]
Semantic Gap Detection
Four main steps:
1. Conversion of Metadata to OWL2
2. Detection of Ontology Inconsistencies
3. Isolation of Causing Axioms
4. Generation of Verbal Explanations
Sosnovsky, S., & Alpizar-Chacon, I. (2014). Semantic gap detection in metadata of adaptive learning environments. In Proceedings of ICALT'2014: 14th International
Conference on Advanced Learning Technologies (pp. 548-552). IEEE Computer Society.
Math-Bridge Metadata Schema
Step 1: Conversion of Metadata to OWL2
[Figure: OMDoc metadata is converted to OWL2 with an XSLT stylesheet]
Step 2: Detection of Ontology Inconsistencies
[Figure: RDF graph. activemath:hasDomainPrerequisite is an owl:ObjectProperty with rdfs:domain and rdfs:range declarations; intro_bikers_slope has rdf:type activemath:Text; ex_tour_de_fr has rdf:type activemath:Example; the classes activemath:KnowledgeItem, activemath:ConceptItem, and activemath:SateliteItem are also shown. Result: Inconsistent!]
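The kind of clash shown in Step 2 can be sketched in a few lines of Python. This is only an illustration and not the reasoner used in the project: the class and property names come from the slide, but the subclass hierarchy, the disjointness axiom, the domain/range values, and the example triple are all assumptions made for the sake of the sketch.

```python
# Hypothetical mini-ontology. Names follow the slide; every axiom below
# (hierarchy, disjointness, domain, range) is an ASSUMPTION for illustration.
SUBCLASS_OF = {
    "activemath:Text": "activemath:SateliteItem",
    "activemath:Example": "activemath:SateliteItem",
    "activemath:SateliteItem": "activemath:KnowledgeItem",
    "activemath:ConceptItem": "activemath:KnowledgeItem",
}
DISJOINT = {frozenset({"activemath:ConceptItem", "activemath:SateliteItem"})}
DOMAIN = {"activemath:hasDomainPrerequisite": "activemath:ConceptItem"}
RANGE = {"activemath:hasDomainPrerequisite": "activemath:ConceptItem"}
TYPES = {
    "intro_bikers_slope": "activemath:Text",
    "ex_tour_de_fr": "activemath:Example",
}


def ancestors(cls):
    """Return cls together with all of its superclasses."""
    result = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.add(cls)
    return result


def clashes(individual, required_cls):
    """True if the asserted type is disjoint with the class inferred from domain/range."""
    asserted = ancestors(TYPES[individual])
    inferred = ancestors(required_cls)
    return any(frozenset({a, b}) in DISJOINT for a in asserted for b in inferred)


def find_inconsistencies(triples):
    """Check each (subject, property, object) triple against domain and range."""
    problems = []
    for s, p, o in triples:
        if p in DOMAIN and clashes(s, DOMAIN[p]):
            problems.append((s, p, "domain"))
        if p in RANGE and clashes(o, RANGE[p]):
            problems.append((o, p, "range"))
    return problems
```

Under these assumed axioms, the triple (intro_bikers_slope, activemath:hasDomainPrerequisite, ex_tour_de_fr) is reported twice, once for the domain and once for the range.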
Step 3: Isolation of Causing Axioms
Step 4: Generation of Verbal Explanations
The Scale of the Problem
[Figure: diagram with the components Interaction and Adaptation]
Textbooks as a source of (extractable) knowledge
• Focus (narrow, cohesive domain)
• Quality (created by domain experts)
• Purpose (content explains domain knowledge to a novice)
• Structure: sections / subsections
• Order: easy to complex
• Formatting: of content and headers
• Additional structural elements: indices, tables of contents

• Topics/subtopics: underlying content, textual labels
• Pedagogical relations: prerequisites <-> outcomes
• Text types/roles and relations: header vs. important vs. regular; same format = same role
• Meaningful labels: glossary of curated meaningful terms; set of important domain categories

• If automatically extracted and formally represented, these elements will form the model of the textbook and the model of the domain as the author understands it
Linking Textbooks to Ontologies
• Topic-based model of an HTML-based Java textbook, automatically extracted and mapped to a central ontology already linked to a set of Java exercises
• Mapping serves as a bridge to jointly interpret a learner's reading and exercise attempts in terms of the ontology and to adapt access to textbook pages accordingly
Project 1
Sosnovsky, S., Hsiao, I-H., & Brusilovsky, P. (2012). Adaptation “in the wild”: Ontology-based personalization of open-corpus learning material. In A. Ravenscroft, S.
Lindstaedt, C. Delgado Kloos, & D. Hernández-Leo (Eds.), Proceedings of EC-TEL'2012: 7th European Conference on Technology Enhanced Learning (pp. 425-431).
Berlin/Heidelberg, Germany: Springer.
Linking Textbooks to Textbooks
• Several LDA-based techniques are used to interlink sections from a set of HTML-based textbooks in a domain
• A manual mapping by experts is used as a gold standard
Project 2
Guerra, J., Sosnovsky, S., & Brusilovsky, P. (2013). When one textbook is not enough: Linking multiple textbooks using probabilistic topic models. In D.
Hernández-Leo, T. Ley, R. Klamma, & A. Harrer (Eds.), Proceedings of EC-TEL'2013: 8th European Conference on Technology Enhanced Learning (pp.
125-138). Berlin/Heidelberg, Germany: Springer.
Interlingua: linking textbooks across languages
Project 3
[Figure: textbooks in DE, EN, and FR, each with a table of contents (Chapter 1, Section 1.1, Subsection 1.1.1, Subsection 1.1.2, …) and an index (term -> page#), are turned into semantic models of the textbook and linked through a Statistics ontology]
Alpizar-Chacon, I., van der Hart, M., Wiersma, Z., Theunissen, L., & Sosnovsky, S. (2020). Transformation of PDF Textbooks into Interactive Educational
Resources. In Proceedings of the Workshop on Intelligent Textbooks at AIEd'2020 (pp. 4-16). Online, July 6, 2020.
Relevant Content in One’s Mother Tongue
Project 3
intextbooks
Isaac Alpizar Chacon
Alpizar-Chacon, I., & Sosnovsky, S. (2020). Knowledge models from PDF textbooks. New Review of Hypermedia and Multimedia, (in press).
Model extraction from PDF textbooks
• PDF as the most common and challenging format
• 4 stages, 9 steps, 39 rules
Alpizar-Chacon, I., & Sosnovsky, S. (2020). Order out of Chaos: Construction of Knowledge Models from PDF Textbooks. In Proceedings of
DocEng’2020: The 20th ACM Symposium on Document Engineering, (Article No.: 8, pp 1–10). New York, NY, USA: ACM Press.
Example Rule
• REPEATED_LINES:
1. Select a sample of pages: $P_s = \{p_a, p_b, \ldots, p_m\}$, $P_s \subset P$.
2. If the first line(s) are identical across $P_s$: a header is detected and removed from all pages $p \in P$.
3. If the last line(s) are identical across $P_s$: a footer is detected and removed from all pages $p \in P$.
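The REPEATED_LINES rule can be sketched as follows. This is a minimal illustration rather than the project's implementation: the function name, the `sample_size` parameter, and the choice of the first non-empty pages as the sample are assumptions, since the slide does not say how the sample is drawn.

```python
def strip_repeated_lines(pages, sample_size=5):
    """Remove headers/footers that repeat verbatim across a sample of pages.

    pages: list of pages, each page a list of text lines.
    The sample P_s is taken as the first non-empty pages (an assumption).
    """
    sample = [p for p in pages if p][:sample_size]
    if len(sample) < 2:
        return [list(p) for p in pages]

    def common_prefix_len(seqs):
        # Count leading positions at which all sequences hold the same line.
        n = 0
        for lines in zip(*seqs):
            if len(set(lines)) != 1:
                break
            n += 1
        return n

    header = common_prefix_len(sample)
    footer = common_prefix_len([p[::-1] for p in sample])

    cleaned = []
    for p in pages:
        end = len(p) - footer
        # As in the rule, the detected lines are dropped from every page in P.
        cleaned.append(p[header:end] if end >= header else [])
    return cleaned
```

Running headers that contain a changing page number are not identical across pages, so they would need a separate rule; this sketch only covers the verbatim case described above.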
2. Role labeling of fragments
Style  Font Family      Font Size  Font Face  Font Color  Occurrences
1      Liberation Sans  35         Bold       Blue        3
2      Liberation Sans  18         Bold       Blue        1
3      Liberation Sans  9          -          Black       153
4      Liberation Sans  9          Bold       Black       2
Most frequent style => Body text
Other roles shown: Chapter, Subchapter
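One plausible way to turn such a style table into roles is sketched below. It assumes that the most frequent style is body text (as the slide indicates) and, as a further assumption not stated in the source, that larger styles are headings ranked by font size. The `Style` class and the "Emphasis" label are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Style:
    style_id: int
    family: str
    size: float
    bold: bool
    color: str
    occurrences: int


def label_roles(styles):
    """Map style ids to roles: most frequent style is body text,
    larger styles become headings ordered by decreasing font size."""
    body = max(styles, key=lambda s: s.occurrences)
    roles = {body.style_id: "Body text"}

    headings = sorted((s for s in styles if s.size > body.size),
                      key=lambda s: -s.size)
    names = ["Chapter", "Subchapter"]
    for level, s in enumerate(headings):
        roles[s.style_id] = names[level] if level < len(names) else f"Subchapter level {level}"

    # Anything else (e.g. bold text at body size) gets a hypothetical label.
    for s in styles:
        roles.setdefault(s.style_id, "Emphasis")
    return roles


styles = [
    Style(1, "Liberation Sans", 35, True, "Blue", 3),
    Style(2, "Liberation Sans", 18, True, "Blue", 1),
    Style(3, "Liberation Sans", 9, False, "Black", 153),
    Style(4, "Liberation Sans", 9, True, "Black", 2),
]
roles = label_roles(styles)
```

With the four styles from the table, style 3 is labeled as body text, styles 1 and 2 as chapter and subchapter, and style 4 falls into the catch-all label.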
3. Processing Table of Contents
[Figure: annotated table of contents showing the TOC section, textbook parts, chapters, subchapters, level-2 subchapters, and the individual page numbers for each section]
3. Processing Index
[Figure: annotated index page showing the cases below]
• Multi-column layout
• Index section
• Index term + page number
• "see" case
• Multiline term
• Nested term
• Range of page numbers
• Reading order =
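A few of the index cases listed above (term plus page numbers, page ranges, and the "see" case) can be illustrated with a small parser. This is a sketch under assumptions: the entry syntax, the function name, and the returned dictionary shape are hypothetical, and multi-column layout, multiline terms, and nesting are not handled.

```python
import re

# Either a page range such as "110-112" or a single page number.
PAGE_RE = re.compile(r"(\d+)\s*[-–]\s*(\d+)|(\d+)")


def parse_index_entry(line):
    """Parse one index line into a term, an optional 'see' target, and pages."""
    line = line.strip()

    # "see" case: the entry points to another term instead of pages.
    see = re.match(r"^(.*?),?\s+see\s+(.+)$", line)
    if see:
        return {"term": see.group(1).strip(), "see": see.group(2).strip(), "pages": []}

    # Term followed by a trailing list of page numbers and ranges.
    m = re.match(r"^(.*?)[,\s]+(\d[\d\s,\-–]*)$", line)
    if not m:
        return {"term": line, "see": None, "pages": []}

    pages = []
    for start, end, single in PAGE_RE.findall(m.group(2)):
        if single:
            pages.append(int(single))
        else:
            # Expand a range of page numbers into individual pages.
            pages.extend(range(int(start), int(end) + 1))
    return {"term": m.group(1).strip(), "see": None, "pages": pages}
```

Expanding ranges into individual pages is a design choice made here so that every term maps to a flat list of pages; keeping the range as a pair would work equally well.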
4. Textbook model
• Structure (sections)
• Content (words, lines, etc.)
• Domain Knowledge (terms)
Potential Problems of These Models
• Variability: structure, labels, order, focus, coverage
• Subjectivity: same domain + different authors = different textbooks => different models
• Quality: completeness, granularity, consistency
• Lack of semantics: more structure than knowledge; lack of links; cohesiveness of topics and index terms
(The slide groups these problems into textbook-level and model-level.)
...nevertheless
• They are automatically extracted models of high-quality resources and underlying domains
• Their individual quality might not be enough, but they can be aggregated
• Linking models to existing ontologies should help filter out less relevant terms and extend them with additional semantic information
• Interlinking multiple models within the same domain should improve the coverage
Evaluation 1 (Accuracy of model extraction)
Domains: Statistics, Computer Science, History, Literature
Evaluation 1 (Accuracy of model extraction): Results
Averages over all domains
• Text extraction: our approach 93.85%; PDFBox 89.72%; PdfAct 84.19%
• TOC recognition: precision 99.92%; recall 99.92%
• Index recognition: precision 98.56%; recall 98.13%
Evaluation 2 (Value of Extracted Models – Semantic Linking of Textbooks)
[Figure: chapter/subchapter trees of Book#1 (Chap1 to Chap3) and Book#2 (Chap1 to Chap4) with links between their sections]
Evaluation 2 (Value of Extracted Models – Semantic Linking of Textbooks): Method
• Ground truth: average of manual linking of two textbooks by three experts in statistics
• Measure: NDCG (normalized discounted cumulative gain) at 1, 3, and 5
• Baselines: TFIDF model, LDA model
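For readers unfamiliar with the measure, NDCG@k can be computed as in the sketch below. It uses the common log2 discount and derives the ideal ranking from the relevance values of the ranked list itself, which assumes the list contains all relevant candidates; the talk does not specify which NDCG variant was used.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain of the first k items (positions start at 1)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg(ranked_relevances, k):
    """DCG at k divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(ranked_relevances, reverse=True), k)
    return dcg(ranked_relevances, k) / ideal if ideal > 0 else 0.0
```

Here `ranked_relevances` would hold, for one section of the first textbook, the expert-assigned relevance of each candidate section of the second textbook in the order the model ranked them.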
Evaluation 2 (Value of Extracted Models – Semantic Linking of Textbooks): Results
[Figure: bar chart of NDCG@1, NDCG@3, and NDCG@5 (axis from 0 to 0.9) for TFIDF, LDA, TFIDF+LDA, and our model]
Model linking to
Alpizar-Chacon, I., & Sosnovsky, S. (2019). Expanding the Web of Knowledge: One Textbook at a Time. In Proceedings of ACM Hypertext’2019: 30th
International Conference on Hypertext and Social Media (pp. 9-18). New York, NY, USA: ACM Press.
1. Construction of the Glossary
a) Index parsing
b) Term recognition
c) Glossary creation
• Preparation for the next phase
[Figure: index entries under "D" (Distribution 85, with nested entries Gamma 106 and Normal 92) become glossary terms with candidate labels: "Distribution" (85); "Gamma Distribution" / "Distribution Gamma" (106); "Normal Distribution" / "Distribution Normal" (92)]
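The candidate labels in the glossary example can be generated mechanically from nested index entries. The sketch below assumes a simple input shape (term, page, list of nested sub-entries), which is a hypothetical structure chosen for illustration and not the project's data model.

```python
def glossary_entries(index):
    """Build glossary entries with candidate labels from a nested index.

    index: list of (term, page, subentries), where subentries is a list of
    (subterm, page) pairs nested under the term.
    """
    entries = []
    for term, page, subentries in index:
        entries.append({"labels": [term], "page": page})
        for sub, sub_page in subentries:
            # A nested term may read naturally in either order, so keep both.
            entries.append({
                "labels": [f"{sub} {term}", f"{term} {sub}"],
                "page": sub_page,
            })
    return entries


index = [("Distribution", 85, [("Gamma", 106), ("Normal", 92)])]
entries = glossary_entries(index)
```

Keeping both orderings defers the choice of the correct label to the later linking phase, where only one of the candidates is expected to match an external resource.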
2.a Core set construction
• We use index terms to query DBpedia => find matching resources
• DBpedia resources can have categories (e.g., Statistics)
• Categories form a hierarchy (e.g., Statistics / Statistical_models / ...)
• In the beginning, we select the target top category (define the domain)
  • The algorithm looks 2 more levels deeper
  • This is the only manual input required
• If a query retrieves only 1 DBpedia resource and it belongs to one of the target categories (dct:subject), this resource becomes part of the core set
• dbo:abstract's of all core set resources are concatenated to form the domain context (used at Step 2.c)
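The core set logic can be sketched on in-memory data, without querying DBpedia. All inputs below are hypothetical stand-ins: `subcategories` mimics the category hierarchy, `query_results` the resources returned per index term, and `categories_of` the dct:subject values of each resource.

```python
def target_categories(top, subcategories, extra_levels=2):
    """Collect the top category plus its descendants up to extra_levels deep."""
    selected, frontier = {top}, {top}
    for _ in range(extra_levels):
        frontier = {c for parent in frontier
                    for c in subcategories.get(parent, [])} - selected
        selected |= frontier
    return selected


def build_core_set(query_results, categories_of, targets):
    """Keep terms whose query returned exactly one resource in a target category."""
    core = {}
    for term, resources in query_results.items():
        if len(resources) == 1 and targets & set(categories_of.get(resources[0], [])):
            core[term] = resources[0]
    return core


# Hypothetical example data.
subcategories = {
    "Statistics": ["Statistical_models"],
    "Statistical_models": ["Regression_models"],
    "Regression_models": ["Too_deep"],
}
query_results = {
    "standard score": ["Standard_score"],
    "normal": ["Normal_distribution", "Normal_(geometry)"],
    "Euro coin": ["Euro_coins"],
}
categories_of = {"Standard_score": ["Statistical_models"], "Euro_coins": ["Coins"]}

targets = target_categories("Statistics", subcategories)
core = build_core_set(query_results, categories_of, targets)
```

In this toy data only "standard score" enters the core set: "normal" is ambiguous and goes to the candidate set, and "Euro coin" has a single match outside the target categories.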
2.b Candidate set construction
• If a query retrieves several DBpedia resources, they form the candidate set of the term
• Context is gathered for every candidate resource:
  • dbo:abstract of this resource +
  • dbo:abstract's of all resources linked to it
• Context helps during the next step
2.c Resource disambiguation
• For each resource from a candidate set, cosine similarity is computed between the context of the resource and the domain context
• The resource with the highest cosine similarity (and > threshold) is matched to the term
• Newly obtained resources help to extend the domain context
• Step 2.c repeats until no more new terms can be matched
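The disambiguation step can be illustrated with a plain bag-of-words cosine similarity. This is a sketch under assumptions: the talk does not say how the contexts are vectorized, so raw word counts are used here, and the threshold value and example texts are made up.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts (a simplification; no stemming or weighting)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def disambiguate(candidates, domain_context, threshold=0.1):
    """Return the candidate whose context is most similar to the domain
    context, or None if no candidate exceeds the (assumed) threshold."""
    domain = bag_of_words(domain_context)
    best, best_score = None, threshold
    for resource, context in candidates.items():
        score = cosine(bag_of_words(context), domain)
        if score > best_score:
            best, best_score = resource, score
    return best


domain_context = "statistics probability distribution sample mean variance"
candidates = {
    "Normal_distribution": "probability distribution with mean and variance",
    "Normal_(geometry)": "line perpendicular to a surface",
}
match = disambiguate(candidates, domain_context)
```

To mirror the iteration described above, the abstract of each newly matched resource would be appended to `domain_context` and the remaining terms retried until no new match is found.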
3. Model Enrichment
• Abstract
• Wikipedia link
• Categories
• Relation to other terms
• Multilingual information
• …
[Figure: enrichment example for the term "standard score". Wikipedia link: http://en.wikipedia.org/wiki/Standard_score. Abstracts: EN "In statistics, the standard score is the (signed) number of standard deviations an observation or…"; FR "En probabilités et statistiques, une variable centrée réduite est une variable aléatoire…"; DE "Unter Standardisierung oder z-Transformation versteht man in der mathematischen Statistik eine …". dct:subject: Statistical Ratios (shared with t-statistics, …). rdf:type: yago:WikicatStatisticalRatios]
03-12-2020
TEI Textbook Model
• Structure (sections)
• Content (words, lines, titles, etc.)
• Domain Knowledge (terms)
• + RDFa attributes
Evaluation: Linking to DBpedia
• Question: Are the index terms linked to the right DBpedia resources?
• Task: validate the resource disambiguation procedure
• BL1 (random baseline): a random resource in the candidate list is selected as the right resource
• BL2 (default sense baseline): the most linked/popular resource in the candidate list is selected as the right resource
• Ground truth was created manually
[Figure: results for Statistics#1, Statistics#2, and Information Retrieval]
Evaluation: Aggregation of Models
• Question: Would aggregation of additional textbooks move the model closer to the ideal domain model (all relevant resources)?
• Ground truth: constructed based on the Glossary of statistical terms (> 1000 terms)
• Task: compare the matching between textbooks and DBpedia with the “ideal” matching between the Glossary and DBpedia
[Figure: results for an average single textbook, an average of 5 textbooks, and 10 textbooks]
Transformation of PDF textbooks into interactive HTML
• Structure (sections)
• Content (words, lines, titles, etc.)
• Domain Knowledge (terms)
• + RDFa attributes
Alpizar-Chacon, I., van der Hart, M., Wiersma, Z., Theunissen, L., & Sosnovsky, S. (2020). Transformation of PDF Textbooks into Interactive Educational
Resources. In Proceedings of the Workshop on Intelligent Textbooks at AIEd'2020 (pp. 4-16). Online, July 6, 2020.
PDF to HTML converter
• Several open libraries available: pdf2htmlEX, PDFMiner, pdf2html, Xpdf, etc.
• pdf2htmlEX:
  • preserves the layout perfectly across very different types of documents
  • produces the same structure across different documents
  • fast, stable, and scalable
TEI-HTML synchronizer
Validation
• Test the accuracy of the matching algorithm for the TEI-HTML synchronization
• 70 university-level textbooks
• Domains: statistics, computer science, web programming, literature, history
• Evaluation metric: percentage of words that were matched between the TEI and HTML representations
• Results: 87-90%
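The evaluation metric can be illustrated with a word-level alignment. The sketch below substitutes Python's standard `difflib.SequenceMatcher` for the project's own matching algorithm, which is not described in the slides, and it measures the matched share relative to the TEI word list; both choices are assumptions.

```python
from difflib import SequenceMatcher


def matched_word_percentage(tei_words, html_words):
    """Percentage of TEI words aligned with the same word in the HTML sequence."""
    if not tei_words:
        return 0.0
    # autojunk=False disables difflib's popularity heuristic for long inputs.
    matcher = SequenceMatcher(None, tei_words, html_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / len(tei_words)


tei = "the normal distribution is symmetric".split()
html = "the normal distribution is very symmetric".split()
score = matched_word_percentage(tei, html)
```

An order-preserving alignment is used because the two representations come from the same page, so matched words are expected to appear in the same reading order.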
Current Work (1):
Extraction of accurate domain models from textbook indices
• Index entries have different roles (different domain specificity):
  - introduce core domain terms: <hypotheses testing>
  - introduce related domain terms: <factorial>, <sample space>
  - serve various pedagogical purposes (examples, use-cases, data, etc.): <Euro coin>, <Bovine Spongiform Encephalopathy>
Current Work (1):
Extraction of accurate domain models from textbook indices
Approach:
1. Use DBpedia to infer the domain specificity of matched index terms
2. Utilise DBpedia structure (categories and resources) and associated textual content
3. Integrate indices from multiple textbooks to discover a “better” domain model
Domains:
1. Statistics
2. Classic Philosophy
Current Work (2):
From tables of contents to topics
• Add rules for filtering out non-topical sections / TOC entries
• Explore how hierarchy, order and labels of topics can help domain model extraction
• Create a global table of contents of the domain from multiple textbooks
• Personalised textbook generation
Current Work (3):
Assessment generation
• Use the rich intextbooks models (structured textual content annotated with domain models, linked to DBpedia, linked to other textbooks) to
  • generate self-assessment questions on demand
  • target a specific subset of the model/content
  • enable adaptive assessment generation
Thank you!
https://github.com/intextbooks/ITCore
https://intextbooks.science.uu.nl
Contact:
Isaac Alpizar-Chacon <i.alpizarchacon@uu.nl>

Generation of Assessment Questions from Textbooks Enriched with Knowledge ModelsGeneration of Assessment Questions from Textbooks Enriched with Knowledge Models
Generation of Assessment Questions from Textbooks Enriched with Knowledge ModelsSergey Sosnovsky
 
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...Sergey Sosnovsky
 
Dental TutorBot: Exploitation of Dental Textbooks for Automated Learning
Dental TutorBot: Exploitation of Dental Textbooks for Automated LearningDental TutorBot: Exploitation of Dental Textbooks for Automated Learning
Dental TutorBot: Exploitation of Dental Textbooks for Automated LearningSergey Sosnovsky
 
Using Programmed Instruction to Help Students Engage with eTextbook Content
Using Programmed Instruction to Help Students Engage with eTextbook Content Using Programmed Instruction to Help Students Engage with eTextbook Content
Using Programmed Instruction to Help Students Engage with eTextbook Content Sergey Sosnovsky
 
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...Sergey Sosnovsky
 
Interlingua: Linking Textbooks Across Different Languages
Interlingua: Linking Textbooks Across Different Languages Interlingua: Linking Textbooks Across Different Languages
Interlingua: Linking Textbooks Across Different Languages Sergey Sosnovsky
 

Más de Sergey Sosnovsky (20)

Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...
Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...
Toward Eliminating Hallucinations: GPT-based Explanatory AI for Intelligent T...
 
Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...
Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...
Layout- and Activity-based Textbook Modeling for Automatic PDF Textbook Extra...
 
Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...
Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...
Exploring the Content Ecosystem of the First Open-source Adaptive Tutor and i...
 
Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...
Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...
Advancing Intelligent Textbooks with Automatically Generated Practice: A Larg...
 
Creating Session Data from eTextbook Event Streams
Creating Session Data from eTextbook Event StreamsCreating Session Data from eTextbook Event Streams
Creating Session Data from eTextbook Event Streams
 
Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...
Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...
Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions ...
 
Interactions of reading and assessment activities
Interactions of reading and assessment activitiesInteractions of reading and assessment activities
Interactions of reading and assessment activities
 
Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...
Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...
Parallel Construction: A Parallel Corpus Approach for Automatic Question Gene...
 
YAI4Edu: an Explanatory AI to Generate Interactive e-Books for Education
YAI4Edu: an Explanatory AI to Generate Interactive e-Books for EducationYAI4Edu: an Explanatory AI to Generate Interactive e-Books for Education
YAI4Edu: an Explanatory AI to Generate Interactive e-Books for Education
 
Automatic Question Generation for Evidence-based Online Courseware Engineering
Automatic Question Generation for Evidence-based Online Courseware EngineeringAutomatic Question Generation for Evidence-based Online Courseware Engineering
Automatic Question Generation for Evidence-based Online Courseware Engineering
 
Reading Comprehension Quiz Generation using Generative Pre-trained Transformers
Reading Comprehension Quiz Generation using Generative Pre-trained TransformersReading Comprehension Quiz Generation using Generative Pre-trained Transformers
Reading Comprehension Quiz Generation using Generative Pre-trained Transformers
 
Mathematical Language Processing via Tree Embeddings
Mathematical Language Processing via Tree EmbeddingsMathematical Language Processing via Tree Embeddings
Mathematical Language Processing via Tree Embeddings
 
Contextual Definition Generation
Contextual Definition GenerationContextual Definition Generation
Contextual Definition Generation
 
Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...
Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...
Transforming Textbooks into Learning by Doing Environments: An Evaluation of ...
 
Generation of Assessment Questions from Textbooks Enriched with Knowledge Models
Generation of Assessment Questions from Textbooks Enriched with Knowledge ModelsGeneration of Assessment Questions from Textbooks Enriched with Knowledge Models
Generation of Assessment Questions from Textbooks Enriched with Knowledge Models
 
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
 
Dental TutorBot: Exploitation of Dental Textbooks for Automated Learning
Dental TutorBot: Exploitation of Dental Textbooks for Automated LearningDental TutorBot: Exploitation of Dental Textbooks for Automated Learning
Dental TutorBot: Exploitation of Dental Textbooks for Automated Learning
 
Using Programmed Instruction to Help Students Engage with eTextbook Content
Using Programmed Instruction to Help Students Engage with eTextbook Content Using Programmed Instruction to Help Students Engage with eTextbook Content
Using Programmed Instruction to Help Students Engage with eTextbook Content
 
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...
Adding Intelligence to a Textbook for Human Anatomy with a Causal Concept Map...
 
Interlingua: Linking Textbooks Across Different Languages
Interlingua: Linking Textbooks Across Different Languages Interlingua: Linking Textbooks Across Different Languages
Interlingua: Linking Textbooks Across Different Languages
 

Último

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 

Último (20)

Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 

What's in a textbook

  • 2. Architecture of an AES. Diagram components: Instructional Content, Interaction, User Model, Adaptation Model, Adaptation, Metadata, Domain Model
  • 3. Math-Bridge: Rich Adaptive and Intelligent Textbooks. Sosnovsky, S., Dietrich, M., Andrès, E., Goguadze, G., Winterstein, S., Libbrecht, P., Siekmann, J., & Melis, E. (2014). Math-Bridge: Bridging the gaps in European remedial mathematics with technology-enhanced learning. In T. Wassong, D. Frischemeier, P. R. Fischer, R. Hochmuth, & P. Bender (Eds.), Mit Werkzeugen Mathematik und Stochastik lernen – Using Tools for Learning Mathematics and Statistics (pp. 437-451). Berlin/Heidelberg, Germany: Springer.
  • 7. Metadata annotation is difficult: error-prone, time-consuming, limited support of tools, often many authors, often non-expert authors • The Math-Bridge metadata schema has more than 30 elements • The Math-Bridge content collection contains more than 10 000 learning objects • About 50 people were involved in preparing this collection
  • 8. The Burden of Authoring • Learning content authoring has always been tedious, expertise-demanding, and poorly supported • Content and knowledge authoring for adaptive intelligent systems requires a lot of extra effort • Information and knowledge existing in the system should become not the authoring burden but the vehicle for authoring support. Figure panels: Authoring for e-Learning; Authoring for Adaptive e-Learning; Authoring for Adaptive e-Learning as It Should Be (labels: Instructional Content, Metadata)
  • 9. Semantic Gap Detection. Four main steps: 1. Conversion of Metadata to OWL2 2. Detection of Ontology Inconsistencies 3. Isolation of Causing Axioms 4. Generation of Verbal Explanations. Sosnovsky, S., & Alpizar-Chacon, I. (2014). Semantic gap detection in metadata of adaptive learning environments. In Proceedings of ICALT'2014: 14th International Conference on Advanced Learning Technologies (pp. 548-552). IEEE Computer Society.
  • 11. Step 1: Conversion of Metadata to OWL2. Diagram labels: OMDoc, XSLT Stylesheet, OWL2
  • 12. Step 2: Detection of Ontology Inconsistencies. Diagram labels: owl:ObjectProperty activemath:hasDomainPrerequisite with rdfs:domain and rdfs:range; intro_bikers_slope rdf:type activemath:Text; ex_tour_de_fr rdf:type activemath:Example; classes activemath:KnowledgeItem, activemath:ConceptItem, activemath:SateliteItem. Result: Inconsistent!
  • 13. Step 3: Isolation of Causing Axioms
  • 14. Step 4: Generation of Verbal Explanations
  • 15. The Scale of the Problem. Diagram labels: Interaction, Adaptation
  • 16. Textbooks as a source of (extractable) knowledge • Focus (narrow, cohesive domain) • Quality (created by domain experts) • Purpose (content explains domain knowledge to a novice) • Structure: sections / subsections • Order: easy to complex • Formatting: of content and headers • Additional structural elements: indices, tables of contents • Topics/subtopics: underlying content, textual labels • Pedagogical relations: prerequisites <-> outcomes • Text types/roles and relations: header vs important vs regular; same format = same role • Meaningful labels: glossary of curated meaningful terms; set of important domain categories • If automatically extracted and formally represented, these elements will form the model of the textbook and the model of the domain as the author understands it
  • 17. Linking Textbooks to Ontologies (Project 1) • Topic-based model of an HTML-based Java textbook automatically extracted and mapped to a central ontology already linked to a set of Java exercises • The mapping serves as a bridge to jointly interpret a learner’s reading and exercise attempts in terms of the ontology and adapt access to textbook pages accordingly. Sosnovsky, S., Hsiao, I-H., & Brusilovsky, P. (2012). Adaptation “in the wild”: Ontology-based personalization of open-corpus learning material. In A. Ravenscroft, S. Lindstaedt, C. Delgado Kloos, & D. Hernández-Leo (Eds.), Proceedings of EC-TEL'2012: 7th European Conference on Technology Enhanced Learning (pp. 425-431). Berlin/Heidelberg, Germany: Springer.
  • 18. Linking Textbooks to Textbooks (Project 2) • Several LDA-based techniques are used to interlink sections from a set of HTML-based textbooks in a domain • A manual mapping by experts is used as a gold standard. Guerra, J., Sosnovsky, S., & Brusilovsky, P. (2013). When one textbook is not enough: Linking multiple textbooks using probabilistic topic models. In D. Hernández-Leo, T. Ley, R. Klamma, & A. Harrer (Eds.), Proceedings of EC-TEL'2013: 8th European Conference on Technology Enhanced Learning (pp. 125-138). Berlin/Heidelberg, Germany: Springer.
  • 19. Interlingua: linking textbooks across languages (Project 3). Diagram elements: Statistics ontology; semantic model of the textbook (Chapter 1, Section 1.1, Subsection 1.1.1, Subsection 1.1.2, …, Section 1.2, Subsection 1.2.2, …; index entries of the form term -> page#); textbooks in DE, EN, and FR. Alpizar-Chacon, I., van der Hart, M., Wiersma, Z., Theunissen, L., & Sosnovsky, S. (2020). Transformation of PDF Textbooks into Interactive Educational Resources. In Proceedings of the Workshop on Intelligent Textbooks at AIEd'2020 (pp. 4-16). Online, July 6, 2020.
  • 20. Relevant Content in One’s Mother Tongue (Project 3)
  • 21. intextbooks. Isaac Alpizar Chacon. Alpizar-Chacon, I., & Sosnovsky, S. (2020). Knowledge models from PDF textbooks. New Review of Hypermedia and Multimedia, (in press).
  • 22. Model extraction from PDF textbooks • PDF as the most common and challenging format • 4 stages, 9 steps, 39 rules. Alpizar-Chacon, I., & Sosnovsky, S. (2020). Order out of Chaos: Construction of Knowledge Models from PDF Textbooks. In Proceedings of DocEng’2020: The 20th ACM Symposium on Document Engineering (Article No. 8, pp. 1–10). New York, NY, USA: ACM Press.
  • 23. Example Rule: REPEATED_LINES 1. Select a sample of pages: $P_s = \{p_a, p_b, \dots, p_m\}$, $P_s \subset P$. 2. If the first line(s) are identical across $P_s$: a header is detected and removed in all pages $p \in P$. 3. If the last line(s) are identical across $P_s$: a footer is detected and removed in all pages $p \in P$.
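The REPEATED_LINES rule can be sketched in a few lines of Python. This is a minimal illustration, not the intextbooks implementation: the function names, the sample size, and the simplification to a single header line and a single footer line (the slide allows several) are all assumptions, and pages are assumed to be plain lists of text lines.

```python
import random
from typing import List, Optional

Page = List[str]  # hypothetical representation: a page is a list of text lines


def _shared_line(pages: List[Page], index: int) -> bool:
    """True if every sampled page has the same non-empty line at `index` (0 = first, -1 = last)."""
    if not pages or not all(pages):
        return False
    lines = {page[index].strip() for page in pages}
    return len(lines) == 1 and "" not in lines


def remove_repeated_lines(pages: List[Page], sample_size: int = 5,
                          seed: Optional[int] = 0) -> List[Page]:
    """Detect a header/footer on a sample of pages, then strip it from all pages."""
    rng = random.Random(seed)
    sample = rng.sample(pages, min(sample_size, len(pages)))  # P_s, a subset of P
    has_header = _shared_line(sample, 0)
    has_footer = _shared_line(sample, -1)

    cleaned = []
    for page in pages:
        start = 1 if has_header and page else 0
        # Only drop the last line if something is left after removing the header.
        end = len(page) - 1 if has_footer and len(page) > start else len(page)
        cleaned.append(page[start:end])
    return cleaned
```

Real running headers often contain a changing page number, so an exact-match test like this one would need to be relaxed in practice.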
  • 24. 2. Role labeling of fragments
  • 25. 2. Role labeling of fragments. Text styles (Style 1 to Style 4) found in the example:
      Style | Font Family     | Font Size | Font Face | Font Color | Occurrences
      1     | Liberation Sans | 35        | Bold      | Blue       | 3
      2     | Liberation Sans | 18        | Bold      | Blue       | 1
      3     | Liberation Sans | 9         | -         | Black      | 153
      4     | Liberation Sans | 9         | Bold      | Black      | 2
    => Roles: Body text, Chapter, Subchapter
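One plausible way to go from such a style table to roles is sketched below. The heuristic is an assumption made for illustration, not the rule set used by the authors: the most frequent style is taken as body text, and larger styles are ranked by font size and assigned heading roles in order. The `Style` tuple and `label_roles` function are hypothetical names.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Sequence


class Style(NamedTuple):
    family: str
    size: float
    face: str
    color: str


def label_roles(fragment_styles: List[Style],
                heading_roles: Sequence[str] = ("chapter", "subchapter")) -> Dict[Style, str]:
    """Assign a role to every distinct style, given the style of each text fragment."""
    if not fragment_styles:
        return {}
    counts = Counter(fragment_styles)
    body = counts.most_common(1)[0][0]  # assumed: the dominant style is body text
    roles = {body: "body text"}

    # Assumed: styles larger than body text are headings, biggest first.
    larger = sorted((s for s in counts if s.size > body.size),
                    key=lambda s: s.size, reverse=True)
    for style, role in zip(larger, heading_roles):
        roles[style] = role

    for style in counts:
        roles.setdefault(style, "other")
    return roles
```

Applied to the four styles in the table, this labels style 3 as body text, style 1 as chapter, style 2 as subchapter, and leaves style 4 (bold text at body size) as "other".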
  • 26. 3. Processing Table of Contents
  • 27. 3. Processing Table of Contents. Diagram: a TOC section mapped to the textbook hierarchy (Part, Chapter, Subchapter, Subchapter level 2), with individual page numbers for each section
  • 29. 3. Processing Index. Annotated elements of the index section: multi-column layout, index term + page number, "see" case, multiline term, nested term, range of page numbers, reading order
  • 31. Potential Problems of These Models (grouped as textbook-level and model-level) • Variability: structure, labels, order, focus, coverage • Subjectivity: same domain + different authors = different textbooks => different models • Quality: completeness, granularity, consistency • Lack of semantics: more structure than knowledge, lack of links, cohesiveness of topics and index terms
  • 32. ...nevertheless • They are automatically extracted models of high-quality resources and underlying domains • Their individual quality may not be sufficient, but they can be aggregated • Linking models to the existing ontologies should help filter out less relevant terms and extend them with additional semantic information • Interlinking multiple models within the same domain should improve the coverage
  • 33. Evaluation 1 (Accuracy of model extraction). Domains: Statistics, Computer Science, History, Literature
  • 34. Evaluation 1 (Accuracy of model extraction): Results, averages over all domains • Text extraction: our approach 93.85%, PDFBox 89.72%, PdfAct 84.19% • TOC recognition: precision 99.92%, recall 99.92% • Index recognition: precision 98.56%, recall 98.13%
  • 35. Evaluation 2 (Value of Extracted Models – Semantic Linking of Textbooks). Diagram: sections of Book#1 (Chap1, Sub1, Sub2, Chap2, Chap3) and Book#2 (Chap1, Sub1, Sub2, Sub3, Chap2, Chap3, Sub1, Sub2, Chap4)
  • 36. Evaluation 2 (Value of Extracted Models – Semantic Linking of Textbooks): Method • Ground truth: average of manual linking of two textbooks by three experts in statistics • Measure: NDCG (normalized discounted cumulative gain) at 1, 3, and 5 • Baselines: TFIDF model, LDA model
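For reference, NDCG@k can be computed as below. This is a sketch of the standard definition (gain discounted by log2 of rank + 1, normalized by the ideal ordering), not the authors' evaluation script; it assumes `ranked_relevances` holds the ground-truth relevance of every candidate section in the order the model ranked them, so the ideal ordering can be derived from the same list.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k items (ranks start at 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(ranked_relevances: Sequence[float], k: int) -> float:
    """DCG of the model's ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0
```

Because the ground truth is an average over three experts, the relevance values would be graded rather than binary; the functions above accept either.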
  • 37. Evaluation 2 (Value of Extracted Models – Semantic Linking of Textbooks): Results. Bar chart of NDCG@1, NDCG@3, and NDCG@5 (y-axis from 0 to 0.9) for TFIDF, LDA, TFIDF+LDA, and our model
  • 38. Model linking to. Alpizar-Chacon, I., & Sosnovsky, S. (2019). Expanding the Web of Knowledge: One Textbook at a Time. In Proceedings of ACM Hypertext’2019: 30th International Conference on Hypertext and Social Media (pp. 9-18). New York, NY, USA: ACM Press.
  • 39. 1. Construction of the Glossary: a) Index parsing b) Term recognition c) Glossary creation • Preparation for the next phase. Example: the index entry "Distribution" (page 85) with nested entries "Gamma" (page 106) and "Normal" (page 92) yields the glossary terms (with candidate labels) "Distribution" (85), "Gamma Distribution" / "Distribution Gamma" (106), and "Normal Distribution" / "Distribution Normal" (92)
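The glossary example on this slide (a nested index entry such as Gamma under Distribution becoming a term with two candidate labels) can be mimicked with a small sketch. The data layout and the `build_glossary` name are hypothetical, and only one level of nesting is handled.

```python
from typing import Dict, List, Tuple

# Hypothetical parsed index entry: (term, pages, [(nested term, pages), ...])
IndexEntry = Tuple[str, List[int], List[Tuple[str, List[int]]]]


def build_glossary(index: List[IndexEntry]) -> Dict[str, dict]:
    """Turn parsed index entries into glossary terms with candidate labels."""
    glossary: Dict[str, dict] = {}
    for term, pages, subentries in index:
        glossary[term] = {"labels": [term], "pages": pages}
        for sub, sub_pages in subentries:
            # A nested entry gets both word orders as candidate labels.
            glossary[f"{sub} {term}"] = {
                "labels": [f"{sub} {term}", f"{term} {sub}"],
                "pages": sub_pages,
            }
    return glossary


index = [("Distribution", [85], [("Gamma", [106]), ("Normal", [92])])]
glossary = build_glossary(index)
```

Keeping several candidate labels per term gives the later DBpedia lookup more than one string to try.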
  • 40. 2.a Core set construction • We use index terms to query DBpedia => find matching resources • DBpedia resources can have categories (e.g., Statistics) • Categories form a hierarchy (e.g., Statistics / Statistical_models / ...) • In the beginning, we select the target top category (define the domain) • The algorithm looks 2 more levels deeper • This is the only manual input required • If a query retrieves only 1 DBpedia resource and it belongs to one of the target categories (dct:subject), this resource becomes part of the core set • The dbo:abstract values of all core set resources are concatenated to form the domain context (used at Step 2.c)
  • 41. 2.b Candidate set construction • If a query retrieves several DBpedia resources, they form the candidate set of the term • Context is gathered for every candidate resource: the dbo:abstract of this resource + the dbo:abstract values of all resources linked to it • Context helps during the next step
  • 42. 2.c Resource disambiguation • For each resource from a candidate set, cosine similarity is computed between the context of the resource and the domain context • The resource with the highest cosine similarity (and > threshold) is matched to the term • Newly obtained resources help to extend the domain context • Step 2.c repeats until no more new terms can be matched
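The disambiguation loop can be sketched as follows. This is an illustration under stated assumptions rather than the published algorithm: contexts are compared as plain bag-of-words vectors (the slides do not say how text is vectorized), the threshold value is arbitrary, and the function names and input layout are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict


def bag_of_words(text: str) -> Counter:
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def disambiguate(candidates: Dict[str, Dict[str, str]], domain_context: str,
                 threshold: float = 0.2) -> Dict[str, str]:
    """candidates maps term -> {resource id -> context text}; returns term -> chosen resource."""
    domain = bag_of_words(domain_context)
    matched: Dict[str, str] = {}
    progress = True
    while progress:  # repeat until a full pass matches nothing new
        progress = False
        for term, resources in candidates.items():
            if term in matched or not resources:
                continue
            best, score = max(
                ((res, cosine(bag_of_words(ctx), domain)) for res, ctx in resources.items()),
                key=lambda pair: pair[1],
            )
            if score > threshold:
                matched[term] = best
                domain.update(bag_of_words(resources[best]))  # extend the domain context
                progress = True
    return matched
```

Extending the domain context after each match is what lets a term that failed on an earlier pass succeed on a later one.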
  • 43. 3. Model Enrichment • Abstract • Wikipedia link • Categories • Relation to other terms • Multilingual information • … Example for the term "standard score": EN abstract "In statistics, the standard score is the (signed) number of standard deviations an observation or…"; FR abstract "En probabilités et statistiques, une variable centrée réduite est une variable aléatoire…"; DE abstract "Unter Standardisierung oder z-Transformation versteht man in der mathematischen Statistik eine …"; Wikipedia link http://en.wikipedia.org/wiki/Standard_score; dct:subject Statistical Ratios (shared with t-statistics); rdf:type yago:WikicatStatisticalRatios
  • 44. TEI Textbook Model: Structure (sections), Content (words, lines, titles, etc.), Domain Knowledge (terms) + RDFa attributes. 03-12-2020
  • 45. Evaluation: Linking to DBpedia • Question: Are the index terms linked to the right DBpedia resources? • Task: validate the resource disambiguation procedure • BL1 (random baseline): a random resource in the candidate list is selected as the right resource • BL2 (default sense baseline): the most linked/popular resource in the candidate list is selected as the right resource • Ground truth was created manually. Chart groups: Statistics#1, Statistics#2, Information Retrieval
  • 46. Evaluation: Aggregation of Models • Question: Would aggregation of additional textbooks move the model closer to the ideal domain model (all relevant resources)? • Ground truth: constructed based on the Glossary of statistical terms (> 1000 terms) • Task: compare the matching between textbooks and DBpedia with the “ideal” matching between the Glossary and DBpedia. Chart groups: average single textbook, average 5 textbooks, 10 textbooks
  • 47. Transformation of PDF textbooks into interactive HTML: Structure (sections), Content (words, lines, titles, etc.), Domain Knowledge (terms) + RDFa attributes. Alpizar-Chacon, I., van der Hart, M., Wiersma, Z., Theunissen, L., & Sosnovsky, S. (2020). Transformation of PDF Textbooks into Interactive Educational Resources. In Proceedings of the Workshop on Intelligent Textbooks at AIEd'2020 (pp. 4-16). Online, July 6, 2020.
  • 48. PDF to HTML converter • Several open libraries available: • pdf2htmlEX, PDFMiner, pdf2html, Xpdf, etc. • pdf2htmlEX: • preserves the layout perfectly across very different types of documents • produces the same structure across different documents • fast, stable, and scalable
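As an illustration of this conversion step, pdf2htmlEX can be driven from a script through its command line. The sketch below assumes pdf2htmlEX is installed and on the PATH; the helper names `build_pdf2htmlex_command` and `convert` are hypothetical, and only the `--dest-dir` option is used.

```python
import subprocess
from pathlib import Path


def build_pdf2htmlex_command(pdf_path, out_dir):
    """Build the argument list: pdf2htmlEX --dest-dir <out_dir> <input.pdf> <output.html>."""
    pdf = Path(pdf_path)
    return ["pdf2htmlEX", "--dest-dir", str(out_dir), str(pdf), pdf.stem + ".html"]


def convert(pdf_path, out_dir):
    """Run the conversion; raises CalledProcessError if pdf2htmlEX fails."""
    subprocess.run(build_pdf2htmlex_command(pdf_path, out_dir), check=True)


command = build_pdf2htmlex_command("stats.pdf", "out")
```

Keeping command construction separate from execution makes the call easy to inspect or log before running it on a large batch of textbooks.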
  • 51. Validation • Test the accuracy of the matching algorithm for the TEI-HTML synchronization • 70 university-level textbooks • Domains: statistics, computer science, web programming, literature, history • Evaluation metric: percentage of words that were matched between the TEI and HTML representations • Results: 87-90 %
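The evaluation metric can be illustrated with a small word-alignment sketch. The slides do not describe the actual matching algorithm, so this example substitutes Python's `difflib.SequenceMatcher` as a stand-in aligner; the function name `matched_word_percentage` and the sample word lists are hypothetical.

```python
from difflib import SequenceMatcher


def matched_word_percentage(tei_words, html_words):
    """Percentage of TEI words aligned to a word in the HTML representation.

    Uses difflib as a stand-in for the (unspecified) matching algorithm.
    """
    if not tei_words:
        return 0.0
    matcher = SequenceMatcher(a=tei_words, b=html_words, autojunk=False)
    # Sum the sizes of all in-order matching blocks of words.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / len(tei_words)


# Hypothetical word sequences extracted from the two representations.
tei = ["the", "standard", "score", "is", "defined"]
html = ["the", "standard", "score", "defined"]
percentage = matched_word_percentage(tei, html)
```

In this toy case four of the five TEI words find a counterpart in the HTML sequence, so the metric is 80 %.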
  • 52. Current Work (1): Extraction of accurate domain models from textbook indices • Index entries have different roles (different domain specificity): - introduce core domain terms <hypotheses testing> - introduce related domain terms <factorial>, <sample space> - serve various pedagogical purposes (examples, use cases, data, etc.) <Euro coin>, <Bovine Spongiform Encephalopathy>
  • 53. Current Work (1): Extraction of accurate domain models from textbook indices • Approach: 1. Use DBpedia to infer the domain specificity of matched index terms 2. Utilise DBpedia structure (categories and resources) and associated textual content 3. Integrate indices from multiple textbooks to discover a “better” domain model • Domains: 1. Statistics 2. Classic Philosophy
  • 54. Current Work (2): From tables of contents to topics • Add rules for filtering out non-topical sections / TOC entries • Explore how hierarchy, order and labels of topics can help domain model extraction • Create a global table of contents of the domain from multiple textbooks • Personalised textbook generation
  • 55. Current Work (3): Assessment generation • Use the rich Intextbooks models (structured textual content annotated with domain models, linked to DBpedia, linked to other textbooks) to: • generate self-assessment questions on demand • target a specific subset of the model/content - adaptive assessment generation